[SCRIPT] psync (parallel rsync)
Description
This set of scripts parallelizes the transfer of a huge directory tree while respecting a maximum number of simultaneous transfers.
Instructions
I suggest you launch psync with the following line:
./psync.sh /path/to/folder
Don't launch it with a trailing slash:
- NO: ./psync.sh /path/to/folder/
- YES: ./psync.sh /path/to/folder
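The trailing-slash rule can also be enforced defensively before the path is used. The following is only a minimal sketch: `strip_trailing_slash` is a hypothetical helper and is not part of psync.sh.

```shell
#!/bin/bash
# Hypothetical helper (not part of psync.sh): remove one trailing slash
# so that "/path/to/folder/" is treated like "/path/to/folder".
strip_trailing_slash() {
    local path="$1"
    # Leave the root directory "/" alone; otherwise drop a single final "/".
    if [ "$path" != "/" ]; then
        path="${path%/}"
    fi
    printf '%s\n' "$path"
}

# Both calls print the same normalized path.
strip_trailing_slash "/path/to/folder/"
strip_trailing_slash "/path/to/folder"
```

With something like this in place, TARGET could be normalized once at the top of the script instead of relying on the caller.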
Pre-Reqs
- GNU screen
- rsync
- ssh
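A quick way to verify these prerequisites before a long run is to look each command up in the PATH. This is a sketch under the assumption that the three tools are installed under their usual names; `missing_commands` is a hypothetical helper, not something the scripts provide.

```shell
#!/bin/bash
# Hypothetical helper: print the names of the given commands that are
# not found in PATH, separated by spaces (empty output means all found).
missing_commands() {
    local cmd missing=""
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            missing="${missing}${missing:+ }${cmd}"
        fi
    done
    printf '%s\n' "$missing"
}

missing=$(missing_commands screen rsync ssh)
if [ -n "$missing" ]; then
    echo "Missing prerequisites: $missing" >&2
else
    echo "All prerequisites found"
fi
```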
psync.sh
Description
This script will:
- Check that the directory to transfer exists
- Build the list of directories found at a depth of exactly ${MAXDEPTH}
- Transfer the upper directories in parallel and non-recursively, from depth 0 (the target itself) down to the level just above ${MAXDEPTH} (it prints a message every 100 directories)
- Transfer the directories at depth ${MAXDEPTH} in parallel and recursively (it prints a message for each folder)
- Keep in mind that ${MAXPARALEL} is a soft limit: “check_max_processes()” only polls the process list once per “sleep 1”, so the number of running rsyncs can briefly exceed it.
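The depth-based split described above relies on find returning only the directories at one exact level. Here is a small sketch of that idea on a throwaway tree; `dirs_at_depth` and the demo directory names are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: list directories located exactly DEPTH levels
# below ROOT (depth 0 is ROOT itself), sorted for stable output.
dirs_at_depth() {
    local root="$1" depth="$2"
    find "$root" -mindepth "$depth" -maxdepth "$depth" -type d | sort
}

# Demo on a temporary tree with three directories at depth 3.
demo=$(mktemp -d)
mkdir -p "$demo/a/b/c" "$demo/a/b/d" "$demo/x/y/z"
dirs_at_depth "$demo" 3
rm -rf "$demo"
```

With a depth of 3, only a/b/c, a/b/d and x/y/z are listed; everything above them is what the script copies non-recursively.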
Code
- psync.sh
#!/bin/bash
# psync.sh: parallel rsync of a directory tree, one screen session per transfer.

[ -z "$1" ] && echo "Usage: $0 /path/to/run" && exit 1

TARGET="$1"
[ ! -d "${TARGET}" ] && echo -e "${TARGET}\n not a directory" && exit 1

# Results go to a directory named after the target, next to this script.
LOGDIR="$(dirname "$0")/$(basename "${TARGET}")"
[ -d "${LOGDIR}" ] && echo "Cleanup" && rm -fr "${LOGDIR}"
mkdir -p "${LOGDIR}/transferlogs"

# Wait while more than MAXPARALEL rsync processes are running.
check_max_processes() {
    local MAXPARALEL=$1
    while [ "$(ps waux | grep -E -c ":[0-9]{2} rsync")" -gt "${MAXPARALEL}" ] ; do
        printf "%s" .
        sleep 1
    done
}

sync_this() {
    local MAXDEPTH=3
    local MAXPARALEL=20
    local LAUCHRSYNC="$(dirname "$0")/launch_rsync.sh"
    local DIRLIST=() FOLDER ITEM i x

    # Directories at exactly MAXDEPTH are transferred recursively later on.
    # Note: these loops do not support paths containing whitespace.
    for FOLDER in $(find "${TARGET}" -mindepth ${MAXDEPTH} -maxdepth ${MAXDEPTH} -type d) ; do
        DIRLIST+=("${FOLDER}")
    done

    echo "Copying files and directories NOT recursively"
    for ((i=0; i<MAXDEPTH; i++)); do
        x=0
        for ITEM in $(find "${TARGET}" -mindepth $i -maxdepth $i -type d) ; do
            check_max_processes ${MAXPARALEL}
            # The depth is part of the name so logs of different levels do not overwrite each other.
            screen -S nr_${i}_${x} -d -m "${LAUCHRSYNC}" -nr "${ITEM}" nr_${i}_${x} "${LOGDIR}"
            x=$((x + 1))
            [[ $x =~ [0-9]{1,2}00$ ]] && printf "\n%s\n" "$x Directories Copied Not recursively"
        done
        echo "Depth $i DONE, going deeper"
    done

    echo "Launching recursive rsyncs at depth ${MAXDEPTH}"
    for ((i=0; i<${#DIRLIST[@]}; i++)); do
        printf "\n%s" "Launching rsync $i of ${#DIRLIST[@]}"
        check_max_processes ${MAXPARALEL}
        screen -S r_${i} -d -m "${LAUCHRSYNC}" -r "${DIRLIST[$i]}" r_${i} "${LOGDIR}"
    done
}

sync_this
Script Variables
Variable | Description |
---|---|
TARGET="$1" | The directory that will be transferred |
LOGDIR=$(dirname $0)/$(basename ${TARGET}) | The directory in which you'll find the results of the syncs |
MAXDEPTH=3 | The depth at which the script parallelizes the sync |
MAXPARALEL=20 | Maximum number of rsyncs running at a time |
LAUCHRSYNC="$(dirname $0)/launch_rsync.sh" | Path to the script that launches rsync (launch_rsync.sh, expected next to psync.sh) |
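To illustrate why the MAXPARALEL limit is approximate, here is a sketch of the polling throttle. It is not the script's exact code: `pgrep -x rsync` stands in for the `ps | egrep` pipeline, and `count_rsync` and `wait_for_slot` are hypothetical names. Between two polls nothing prevents new rsyncs from starting, and a freshly launched screen may not show its rsync in the process list yet.

```shell
#!/bin/bash
# Count running rsync processes. pgrep is an assumption here; the original
# script filters the output of "ps waux" instead.
count_rsync() {
    pgrep -x rsync | wc -l | tr -d ' '
}

# Block while more than MAX transfers are running, polling once per second.
# COUNTER names a command that prints the current number of transfers
# (defaults to count_rsync; a different one can be passed in for testing).
wait_for_slot() {
    local max="$1" counter="${2:-count_rsync}"
    while [ "$("$counter")" -gt "$max" ]; do
        printf '%s' .
        sleep 1
    done
}

# Typical use before each launch (left commented out so that loading this
# file has no side effects):
#   wait_for_slot 20
#   screen -S job_1 -d -m ./launch_rsync.sh -r /path/to/dir r_1 /path/to/logs
```

Like the original, the loop only waits while the count is strictly greater than the limit, so one extra transfer beyond the limit is already allowed by the comparison itself.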
launch_rsync.sh
Description
This script will:
- Launch rsync non-recursively (-nr) or recursively (-r)
- Log the exit code of rsync so you know whether everything went fine or not
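The exit-code logging can be sketched independently of rsync. In the example below `run_and_record` is a hypothetical helper, the file names are simplified, and `true` and `false` stand in for a successful and a failed transfer.

```shell
#!/bin/bash
# Hypothetical helper: run a command and append "<exit code> : <label>"
# to TRANSFERS.OK or TRANSFERS.FAIL inside LOGDIR.
run_and_record() {
    local logdir="$1" label="$2" res=0
    shift 2
    # Capture the exit status without aborting if the command fails.
    "$@" || res=$?
    if [ "$res" -eq 0 ]; then
        echo "$res : $label" >> "$logdir/TRANSFERS.OK"
    else
        echo "$res : $label" >> "$logdir/TRANSFERS.FAIL"
    fi
    return "$res"
}

# Demo with stand-in commands instead of rsync.
logdir=$(mktemp -d)
run_and_record "$logdir" "/data/good" true
run_and_record "$logdir" "/data/bad" false || true
cat "$logdir/TRANSFERS.OK" "$logdir/TRANSFERS.FAIL"
rm -rf "$logdir"
```

After a real run, any entry in a .FAIL file points at a directory whose transfer log deserves a look.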
Code
- launch_rsync.sh
#!/bin/bash
# launch_rsync.sh: run one rsync (recursive or not) and record its exit code.

RECURSIVE=$(echo "$1" | tr '[:upper:]' '[:lower:]')
TARGET=$2
SCREENNAME=$3
LOGDIR=$4
DSTSERVER="1.1.1.1"
DESTINATION="${TARGET}"

if [[ "${RECURSIVE}" =~ ^-{1,2}(nr|non-recursive)$ ]] ; then
    # -d copies directory entries without descending into them.
    # stdout is redirected first so that 2>&1 also ends up in the log file.
    rsync -cdlptgoDv --partial "${TARGET}"/* "${DSTSERVER}:${DESTINATION}/" > "${LOGDIR}/transferlogs/${SCREENNAME}_NOTRECURSIVE.log" 2>&1
    RES=$?
elif [[ "${RECURSIVE}" =~ ^-{1,2}(r|recursive)$ ]] ; then
    rsync -cazv --partial "${TARGET}"/* "${DSTSERVER}:${DESTINATION}/" > "${LOGDIR}/transferlogs/${SCREENNAME}.log" 2>&1
    RES=$?
else
    echo "$0 -nr|-r|--non-recursive|--recursive"
    exit 1
fi

# Append the exit code to the OK or FAIL list for this mode (nr or r).
if [ $RES -eq 0 ] ; then
    echo "$RES : ${TARGET}" >> "${LOGDIR}/${RECURSIVE//-/}_TRANSFERS.OK"
else
    echo "$RES : ${TARGET}" >> "${LOGDIR}/${RECURSIVE//-/}_TRANSFERS.FAIL"
fi
Variables
Variable | Description |
---|---|
RECURSIVE=$(echo $1 | tr '[:upper:]' '[:lower:]') | Recursive or not, DON'T MODIFY |
TARGET=$2 | The directory that will be transferred, DON'T MODIFY |
SCREENNAME=$3 | Name of the screen in which the script is running, DON'T MODIFY |
LOGDIR=$4 | Where the results will be logged, DON'T MODIFY |
DSTSERVER="1.1.1.1" | Destination server |
DESTINATION="${TARGET}" | Destination folder. By default it is the same as ${TARGET}, but you will probably want to change it |
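If the data should land under a different root on the destination server, DESTINATION can be derived from TARGET. This is just one possible approach: `remap_destination` and the "/backup" prefix are made up for the example, and the parent directories are assumed to exist already on the remote host.

```shell
#!/bin/bash
# Hypothetical helper: prepend a destination prefix to an absolute
# source path, e.g. /srv/data/projects -> /backup/srv/data/projects.
remap_destination() {
    local prefix="${1%/}" source_path="$2"
    printf '%s%s\n' "$prefix" "$source_path"
}

TARGET="/srv/data/projects"
DESTINATION=$(remap_destination "/backup" "${TARGET}")
echo "${DESTINATION}"
```

Keeping the full source path below the prefix preserves the directory layout, so the non-recursive and recursive passes still write into matching folders.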
linux/parallel_rsync.txt · Last modified: 2022/02/11 11:36 by 127.0.0.1