Shift 8.0

master
Paul Kolano 2021-09-16 17:26:08 -07:00
parent ffb7f5fa1f
commit 1fbeb53e3b
9 changed files with 623 additions and 780 deletions

19
CHANGES

@ -341,3 +341,22 @@ CHANGES
- Fixed manager exception during read of corrupt/incomplete gzi files - Fixed manager exception during read of corrupt/incomplete gzi files
- Fixed handling of manager disk exhaustion to prevent finalized metadata - Fixed handling of manager disk exhaustion to prevent finalized metadata
- Fixed potential divide by zero exceptions during throttling - Fixed potential divide by zero exceptions during throttling
- Fixed lustre handling in shift-bin to current lustre API
* Shift 8.0 (09/16/21)
- Note that metadata is not backward compatible with previous versions
- Added plots by tool, subnet, and batch size
- Added detection of BeeGFS file systems
- Changed plots from line to heatmap for improved scalability
- Changed processing so batches only contain single operation type
- Changed monitoring to support manager hosts across shared file system
- Changed naming of built-in tools in --stats and new --plot by tool
- Fixed detection of DMF file systems when using xdsm mount option
- Fixed detection of GPFS file systems when mmlsmgr not user-accessible
- Fixed --wait status retrieval when integrated with Mesh framework
- Fixed shift-bin striping with lustre progressive layouts when --stripe=0
- Fixed use of XS Data::MessagePack when embedded pure perl version differs
- Fixed scalability of built-in yEnc encoding/decoding
- Fixed unnecessary metadata traversal when showing detailed status
- Fixed application of lustre striping expression to directories
- Removed bbcp and gridftp support since reported fatal bugs never fixed


@ -63,15 +63,10 @@ Shift Installation and Configuration
(https://metacpan.org/pod/Data::MessagePack) (https://metacpan.org/pod/Data::MessagePack)
o IO::Socket::SSL - allows fish-tcp encryption with --secure o IO::Socket::SSL - allows fish-tcp encryption with --secure
(https://metacpan.org/pod/IO::Socket::SSL) (https://metacpan.org/pod/IO::Socket::SSL)
o bbcp - high speed remote copy
(http://www.slac.stanford.edu/~abh/bbcp)
(note latest v17.12.00.00.0 fails when overwriting existing files)
o bbftp - high speed remote copy o bbftp - high speed remote copy
(http://doc.in2p3.fr/bbftp) (http://doc.in2p3.fr/bbftp)
o gnuplot >= 5.0 - allow display of --plot output o gnuplot >= 5.0 - allow display of --plot output
(http://gnuplot.info) (http://gnuplot.info)
o gridftp - high speed remote copy
(http://toolkit.globus.org/toolkit/data/gridftp)
o mcp/msum >= 1.76.7 - high speed local copy/sum o mcp/msum >= 1.76.7 - high speed local copy/sum
(http://mutil.sf.net) (http://mutil.sf.net)
o mesh - lightweight single sign-on via ssh publickey authentication o mesh - lightweight single sign-on via ssh publickey authentication
@ -97,7 +92,6 @@ Shift Installation and Configuration
o su - become non-root process to access manager during root transfers o su - become non-root process to access manager during root transfers
o sysctl - determine number of cpus on BSD o sysctl - determine number of cpus on BSD
o touch - change symlink modification time o touch - change symlink modification time
o unbuffer - interleave stdout/stderr when using gridftp
3. Build (optional - linux only) 3. Build (optional - linux only)


@ -59,8 +59,7 @@ Shift includes the following features, among others:
- fully self-contained besides perl core and ssh - fully self-contained besides perl core and ssh
- automatic detection and selection of higher performance transports and - automatic detection and selection of higher performance transports and
hash utilities when available including bbcp, bbftp, gridftp, mcp, msum, hash utilities when available including bbftp, mcp, msum, and rsync
and rsync
- automatic many-to-many parallelization of single and multi-file - automatic many-to-many parallelization of single and multi-file
transfers with file system equivalence detection and rewriting transfers with file system equivalence detection and rewriting


@ -1,5 +1,5 @@
// //
// Copyright (C) 2012-2020 United States Government as represented by the // Copyright (C) 2012-2021 United States Government as represented by the
// Administrator of the National Aeronautics and Space Administration // Administrator of the National Aeronautics and Space Administration
// (NASA). All Rights Reserved. // (NASA). All Rights Reserved.
// //
@ -252,6 +252,15 @@ int do_getstripe(char *file) {
 //////////////////////
 int do_setstripe(char *file, int scount, unsigned long long ssize, char *pool) {
 #ifndef _NO_LUSTRE
+    if (scount == 0 && ssize == 0) {
+        FILE *rc = fopen(file, "w");
+        if (rc != NULL) {
+            fclose(rc);
+            return 0;
+        } else {
+            return -1;
+        }
+    }
     return llapi_file_create_pool(file, ssize, -1, scount, 0, pool);
 #else
     return -1;

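The branch added to `do_setstripe` can be read in isolation: when both the stripe count and the stripe size are zero, the file is created empty with `fopen` and `llapi_file_create_pool` is not called, presumably so that no explicit layout is imposed (the CHANGES entry ties this to `--stripe=0` with Lustre progressive layouts). The following is a minimal sketch of that branch only, with no Lustre dependency. The function name `create_with_default_layout` is hypothetical and is not part of shift-bin.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of shift-bin): create an empty file with
 * plain fopen() so that no explicit striping is requested.
 * Returns 0 on success and -1 on failure, like the branch in the diff.
 * Note that mode "w" truncates the file if it already exists. */
int create_with_default_layout(const char *path) {
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "w");
    if (fp == NULL) {
        return -1;
    }
    fclose(fp);
    return 0;
}
```

Because mode `"w"` truncates, a caller following this pattern would need to be sure the target is a new file or one that is about to be overwritten anyway.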

@ -58,8 +58,7 @@ automatic striping of files transferred to Lustre file systems
fully self-contained besides perl core and ssh fully self-contained besides perl core and ssh
.IP - .IP -
automatic detection and selection of higher performance transports and automatic detection and selection of higher performance transports and
hash utilities when available including bbcp, bbftp, gridftp, mcp, hash utilities when available including bbftp, mcp, msum, and rsync
msum, and rsync
.IP - .IP -
automatic many-to-many parallelization of single and multi-file automatic many-to-many parallelization of single and multi-file
transfers with file system equivalence detection and rewriting transfers with file system equivalence detection and rewriting
@ -130,10 +129,11 @@ given in following sections.
\-\-mgr\-user=USER access manager host as USER \-\-mgr\-user=USER access manager host as USER
\-\-monitor[=FORMAT] monitor progress of running transfers \-\-monitor[=FORMAT] monitor progress of running transfers
(FORMAT one of {color,csv,pad}) (FORMAT one of {color,csv,pad})
\-\-plot[=[BY:]LIST] plot detailed performance when piped to gnuplot \-\-plot[=[BY[:/]]LIST] plot detailed performance when piped to gnuplot
(BY one of {client,host,id,user}) (BY one of {client,fs,host,id,net,user})
(LIST subset of {chattr,cksum,cp,find,io,ln,meta, (LIST subset of {bbftp,chattr,cksum,cp,find,fish,fish-tcp,
mkdir,sum}) io,ln,mcp,meta, mkdir,msum,rsync,shift-cp,
shift-sum,sum,tool})
\-\-restart[=ignore] restart transfer with given \-\-id [ignoring errors] \-\-restart[=ignore] restart transfer with given \-\-id [ignoring errors]
\-\-search=REGEX show only status/history matching REGEX \-\-search=REGEX show only status/history matching REGEX
\-\-state=STATE show status of only those operations in STATE \-\-state=STATE show status of only those operations in STATE
@ -153,12 +153,10 @@ given in following sections.
(use suffix {k,m,b/g,t} for 1E{3,6,9,12}) [1k] (use suffix {k,m,b/g,t} for 1E{3,6,9,12}) [1k]
\-\-interval=NUM adjust batches to run for around NUM seconds [30] \-\-interval=NUM adjust batches to run for around NUM seconds [30]
\-\-local=LIST set local transport mechanism to one of LIST \-\-local=LIST set local transport mechanism to one of LIST
(LIST subset of {bbcp,bbftp,fish,fish-tcp,gridftp, (LIST subset of {bbftp,fish,fish-tcp,mcp,rsync,shift})
mcp,rsync,shift})
\-\-preallocate=NUM preallocate files when sparsity under NUM percent \-\-preallocate=NUM preallocate files when sparsity under NUM percent
\-\-remote=LIST set remote transport mechanism to one of LIST \-\-remote=LIST set remote transport mechanism to one of LIST
(LIST subset of {bbcp,bbftp,fish,fish-tcp,gridftp, (LIST subset of {bbftp,fish,fish-tcp,rsync,shift})
rsync,shift})
\-\-retry=NUM retry failed operations up to NUM times [2] \-\-retry=NUM retry failed operations up to NUM times [2]
\-\-size=SIZE process transfer in batches of at least SIZE bytes \-\-size=SIZE process transfer in batches of at least SIZE bytes
(use suffix {k,m,g,t} for {KB,MB,GB,TB}) [4g] (use suffix {k,m,g,t} for {KB,MB,GB,TB}) [4g]
@ -356,10 +354,9 @@ there is actually a need to process destination files during the
transfer. transfer.
.IP "\fB\-\-ports=NUM1:NUM2\fP" .IP "\fB\-\-ports=NUM1:NUM2\fP"
Use ports from the range NUM1-NUM2 for the data streams of TCP-based Use ports from the range NUM1-NUM2 for the data streams of TCP-based
transports (currently, bbcp, bbftp, fish-tcp, and gridftp). All transports (currently, bbftp and fish-tcp). All connections
connections originate from the client host so the given port range must originate from the client host so the given port range must be allowed
be allowed on the network path to the remote host and by the remote host on the network path to the remote host and by the remote host itself.
itself.
.IP "\fB\-R, \-r, \-\-recursive\fP" .IP "\fB\-R, \-r, \-\-recursive\fP"
Transfer directories recursively. This option implies Transfer directories recursively. This option implies
\fB\-\-no\-dereference\fP.Note that any symbolic links pointing \fB\-\-no\-dereference\fP.Note that any symbolic links pointing
@ -521,24 +518,39 @@ coloring, respectively. When \fB\-\-id\fP is specified, only the given
transfer will be shown. When all transfers (or the one specified) transfer will be shown. When all transfers (or the one specified)
have completed, the command will exit. This option may be used with have completed, the command will exit. This option may be used with
\fB\-\-wait\fP to monitor progress while waiting. \fB\-\-wait\fP to monitor progress while waiting.
.IP "\fB\-\-plot=[=[BY:]LIST]\fP" .IP "\fB\-\-plot=[=[BY[:/]]LIST]\fP"
Produce output suitable for piping into gnuplot (version 5 or above) Produce output suitable for piping into gnuplot (version 5 or above)
that shows detailed performance over time across all transfers. The that shows detailed performance over time across all transfers. The
\fB\-\-id\fP and \fB\-\-state\fP options may be used to plot only a \fB\-\-id\fP and \fB\-\-state\fP options may be used to plot only a
single transfer or transfers in a particular state, respectively. The single transfer or transfers in a particular state, respectively. The
default plot will show the aggregate performance of each I/O operation default plot will show the aggregate performance of each I/O operation
(i.e. cp, sum, and cksum) and the aggregate performance of each metadata (i.e. cp, sum, and cksum) and the aggregate performance of each metadata
operation (i.e. find, mkdir, ln, and chattr). I/O operations are operation (i.e. find, mkdir, ln, and chattr) across all of the user's
plotted against the left y-axis while metadata operations are plotted transfers. Operations and/or additional groupings are shown on the
against the right y-axis. The list of plotted items may be changed by left y-axis axis across time on the x-axis with heat-based coloring
giving a comma-separated list consisting of one of more of {chattr, indicating MB/s for I/O operations or operations per second for metadata
cksum, cp, find, io, ln, meta, mkdir, sum}. Note that "io" is a operations. In addition, aggregate I/O and metadata performance will be
shorthand for "cp,sum,cksum" and "meta" is a shorthand for shown as an overlayed point plot with green and blue points,
"find,mkdir,ln,chattr". The list of items may be grouped by any of respectively.
{host, id, user} by prefixing one of these terms to the list. For .IP
example, \fB\-\-plot=id:cp\fP would show a curve for the copy The list of plotted items may be changed by giving a comma-separated
performance of each tranfer id. When a grouping is given without a list consisting of one or more of the stages {chattr, cksum, cp, find,
specific list of metrics (e.g. \fB\-\-plot=id\fP), "io" is assumed. io, ln, meta, mkdir, sum} and/or one or more of the tools {bbftp, fish,
fish-tcp, mcp, msum, rsync, shift-cp, shift-sum}. Note that "io" is a
shorthand for "cp,sum,cksum", "meta" is a shorthand for
"find,mkdir,ln,chattr", and "tool" is a shorthand for
"bbftp,fish,fish-tcp,mcp,msum,rsync,shift-cp,shift-sum".
.IP
The list of items may be grouped by any of {client, fs, host, id, net,
user} by prefixing one of these terms to the list. For example,
\fB\-\-plot=id:cp\fP would show a plot of the copy performance achieved
by each transfer id. When a grouping is given without a specific list
of metrics (e.g. \fB\-\-plot=id\fP), "io" is assumed. When a slash "/"
is used instead of colon ":", a heatmap-based bubble plot will be
created with the size of each circle indicating the relative size of the
batch of operations. For example, \fB\-\-plot=fs/tool\fP would show a
plot of the performance that each tool achieved on each file system
with relative batch size.
.IP "\fB\-\-restart[=ignore]\fP" .IP "\fB\-\-restart[=ignore]\fP"
Restart the transfer associated with the given \fB\-\-id\fP that was Restart the transfer associated with the given \fB\-\-id\fP that was
stopped due to unrecoverable errors or stopped explicitly via stopped due to unrecoverable errors or stopped explicitly via
@ -623,12 +635,12 @@ Some advanced options are available to tune various aspects of shiftc
behavior. These options are not needed by most users. behavior. These options are not needed by most users.
.IP "\fB\-\-bandwidth=BITS\fP" .IP "\fB\-\-bandwidth=BITS\fP"
Choose the TCP window size and number of TCP streams of TCP-based Choose the TCP window size and number of TCP streams of TCP-based
transports (currently, bbcp, bbftp, fish-tcp, and gridftp) based on transports (currently, bbftp and fish-tcp) based on the given bits per
the given bits per second. The suffixes k, m, g, and t may be used for second. The suffixes k, m, g, and t may be used for Kb, Mb, Gb, and Tb,
Kb, Mb, Gb, and Tb, respectively. The default bandwidth is estimated to respectively. The default bandwidth is estimated to be 10 Gb/s if a 10
be 10 Gb/s if a 10 GE adapter is found on the client host, 1 Gb/s if the GE adapter is found on the client host, 1 Gb/s if the client host can be
client host can be resolved to an organization domain (by default, one resolved to an organization domain (by default, one of the six original
of the six original generic top-level domains), and 100 Mb/s otherwise. generic top-level domains), and 100 Mb/s otherwise.
.IP "\fB\-\-buffer=SIZE\fP" .IP "\fB\-\-buffer=SIZE\fP"
Use memory buffer(s) of the given size when configurable in the Use memory buffer(s) of the given size when configurable in the
underlying tranport being utilized (currently, all but rsync). The underlying tranport being utilized (currently, all but rsync). The
@ -662,13 +674,13 @@ for manager locks. To make batch selection completely static, use
.IP "\fB\-\-local=LIST\fP" .IP "\fB\-\-local=LIST\fP"
Specify one or more local transports to be used for the transfer in Specify one or more local transports to be used for the transfer in
order of preference, separated by commas. Valid transports for this order of preference, separated by commas. Valid transports for this
option currently include bbcp, bbftp, cp, fish, fish-tcp, gridftp, option currently include bbftp, cp, fish, fish-tcp, mcp, and rsync.
mcp, and rsync. Note that the given transport(s) will be given Note that the given transport(s) will be given priority, but may not be
priority, but may not be used in some cases (e.g. rsync is not capable used in some cases (e.g. rsync is not capable of transferring a specific
of transferring a specific portion of a file as needed by verification portion of a file as needed by verification mode). In such cases, the
mode). In such cases, the default transport based on File::Copy will be default transport based on File::Copy will be used. The tool actually
used. The tool actually used for each file operation can be shown using used for each file operation can be shown using \fB\-\-status\fP with
\fB\-\-status\fP with \fB\-\-id\fP set to the given transfer identifier. \fB\-\-id\fP set to the given transfer identifier.
.IP "\fB\-\-preallocate=NUM\fP" .IP "\fB\-\-preallocate=NUM\fP"
Preallocate files when their sparsity is under the given percent, where Preallocate files when their sparsity is under the given percent, where
sparsity is defined as the number of bytes a file takes up on disk sparsity is defined as the number of bytes a file takes up on disk
@ -681,14 +693,13 @@ transport due to their use of temporary files.
.IP "\fB\-\-remote=LIST\fP" .IP "\fB\-\-remote=LIST\fP"
Specify one or more remote transports to be used for the transfer in Specify one or more remote transports to be used for the transfer in
order of preference, separated by commas. Valid transports for this order of preference, separated by commas. Valid transports for this
option currently include bbcp, bbftp, fish, fish-tcp, gridftp, rsync, option currently include bbftp, fish, fish-tcp, rsync, and sftp. Note
and sftp. Note that the given transport(s) will be given priority, but that the given transport(s) will be given priority, but may not be used
may not be used in some cases (e.g. bbftp is not capable of transferring in some cases (e.g. bbftp is not capable of transferring files with
files with spaces in their names and is also incompatible with spaces in their names and is also incompatible with \fB\-\-secure\fP).
\fB\-\-secure\fP). In such cases, the default transport based on sftp In such cases, the default transport based on sftp will be used. The
will be used. The tool actually used for each file operation can be tool actually used for each file operation can be shown using
shown using \fB\-\-status\fP with \fB\-\-id\fP set to the given transfer \fB\-\-status\fP with \fB\-\-id\fP set to the given transfer identifier.
identifier.
.IP "\fB\-\-retry=NUM\fP" .IP "\fB\-\-retry=NUM\fP"
Retry operations deemed recoverable up to the given number of attempts Retry operations deemed recoverable up to the given number of attempts
per file. The default number of retries is 2. A value of zero disables per file. The default number of retries is 2. A value of zero disables
@ -733,13 +744,13 @@ resulting tar files may still be larger than specified when source files
exist that are larger than the given size. exist that are larger than the given size.
.IP "\fB\-\-streams=NUM\fP" .IP "\fB\-\-streams=NUM\fP"
Use the given number of TCP streams in TCP-based transports (currently, Use the given number of TCP streams in TCP-based transports (currently,
bbcp, bbftp, fish-tcp, and gridftp). The default is the number of bbftp and fish-tcp). The default is the number of streams necessary
streams necessary to fully utilize the specified/estimated bandwidth to fully utilize the specified/estimated bandwidth using the maximum TCP
using the maximum TCP window size. Note that it is usually preferable window size. Note that it is usually preferable to specify
to specify \fB\-\-bandwidth\fP, which allows an appropriate number of \fB\-\-bandwidth\fP, which allows an appropriate number of streams to be
streams to be set automatically. Increasing the number of streams can set automatically. Increasing the number of streams can increase
increase performance when the maximum window size is set too low or performance when the maximum window size is set too low or there is
there is cross-traffic on the network, but too high a value can decrease cross-traffic on the network, but too high a value can decrease
performance due to increased congestion and packet loss. performance due to increased congestion and packet loss.
.IP "\fB\-\-stripe=[CEXP][::[SEXP][::PEXP]]\fP" .IP "\fB\-\-stripe=[CEXP][::[SEXP][::PEXP]]\fP"
By default, a file transferred to a Lustre file system will be striped By default, a file transferred to a Lustre file system will be striped
@ -797,15 +808,14 @@ to 33%, but does not allow bits corrupted during the initial read to be
detected. detected.
.IP "\fB\-\-window=SIZE\fP" .IP "\fB\-\-window=SIZE\fP"
Use a TCP send/receive window of the given size in TCP-based transports Use a TCP send/receive window of the given size in TCP-based transports
(currently, bbcp, bbftp, fish-tcp, and gridftp). The suffixes k, m, (currently, bbftp and fish-tcp). The suffixes k, m, g, and t may be
g, and t may be used for KB, MB, GB, and TB, respectively. The default used for KB, MB, GB, and TB, respectively. The default is the product
is the product of the specified/estimated bandwidth and the round-trip of the specified/estimated bandwidth and the round-trip time between
time between source and destination. Note that it is usually preferable source and destination. Note that it is usually preferable to specify
to specify \fB\-\-bandwidth\fP, which allows an appropriate window size \fB\-\-bandwidth\fP, which allows an appropriate window size to be set
to be set automatically. Increasing the window size allows TCP to automatically. Increasing the window size allows TCP to operate more
operate more efficiently over high bandwidth and/or high latency efficiently over high bandwidth and/or high latency networks, but too
networks, but too high a value can overrun the receiver and cause packet high a value can overrun the receiver and cause packet loss.
loss.
./"################################################################ ./"################################################################
.SH "TRANSFER THROTTLING" .SH "TRANSFER THROTTLING"
./"################################################################ ./"################################################################
@ -1142,5 +1152,5 @@ shiftc was written by Paul Kolano.
./"################################################################ ./"################################################################
.SH "SEE ALSO" .SH "SEE ALSO"
./"################################################################ ./"################################################################
bbcp(1), bbftp(1), cp(1), Date::Parse(3), globus-url-copy(1), mcp(1), bbftp(1), cp(1), Date::Parse(3), mcp(1), msum(1), perlre(1),
msum(1), perlre(1), perlsyn(1), rsync(1), scp(1), sftp(1) perlsyn(1), rsync(1), scp(1), sftp(1)

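The man page text in the hunks above describes three related defaults: the estimated bandwidth (10 Gb/s, 1 Gb/s, or 100 Mb/s), the `--window` default (bandwidth times round-trip time), and the `--streams` default (enough streams to cover that product at the maximum window size). The sketch below works through that arithmetic. All function names are hypothetical, the two detection flags stand in for logic that is not shown in this diff, and the integer rounding is an assumption rather than shiftc's actual behavior.

```c
#include <assert.h>

/* Default bandwidth estimate in bits per second, following the three cases
 * described for --bandwidth. The flags are hypothetical inputs. */
unsigned long long default_bandwidth_bits(int has_10ge, int in_org_domain) {
    if (has_10ge) return 10000000000ULL;      /* 10 Gb/s */
    if (in_org_domain) return 1000000000ULL;  /* 1 Gb/s */
    return 100000000ULL;                      /* 100 Mb/s */
}

/* Bandwidth-delay product in bytes: bandwidth (bits/s) converted to bytes/s,
 * multiplied by the round-trip time given in microseconds. */
unsigned long long default_window_bytes(unsigned long long bits_per_sec,
                                        unsigned long long rtt_usec) {
    return bits_per_sec / 8 * rtt_usec / 1000000ULL;
}

/* Number of streams needed so that streams * max_window covers the window,
 * rounded up and never less than one. */
unsigned int default_streams(unsigned long long window_bytes,
                             unsigned long long max_window_bytes) {
    unsigned long long n;

    assert(max_window_bytes > 0);
    n = (window_bytes + max_window_bytes - 1) / max_window_bytes;
    return n == 0 ? 1u : (unsigned int)n;
}
```

For example, a 10 Gb/s path with a 50 ms round-trip time gives a window of 62,500,000 bytes, which would need 15 streams if the maximum window were 4 MiB (4,194,304 bytes).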

@ -78,20 +78,12 @@ user_dir /home/%u/.shift
# (example: mcp,shift,fish,fish-tcp,rsync,bbftp) # (example: mcp,shift,fish,fish-tcp,rsync,bbftp)
#local_small shift,fish,fish-tcp #local_small shift,fish,fish-tcp
# command-line options that will be used by bbcp on client hosts
# (example: opts_bbcp -s 4 -w 4194304)
#opts_bbcp nodefault
# behavior commands that will be used by bbftp on client hosts # behavior commands that will be used by bbftp on client hosts
# (see the "behavior commands" section of bbftp man page for details) # (see the "behavior commands" section of bbftp man page for details)
# (options must be separated by "\n" as shown in example below) # (options must be separated by "\n" as shown in example below)
# (example: opts_bbftp setnbstream 4\nsetrecvwinsize 4096\nsetsendwinsize 4096) # (example: opts_bbftp setnbstream 4\nsetrecvwinsize 4096\nsetsendwinsize 4096)
#opts_bbftp nodefault #opts_bbftp nodefault
# command-line options that will be used by globus-url-copy on client hosts
# (example: opts_gridftp -p 4 -tcp-bs 4194304)
#opts_gridftp nodefault
# command-line options that will be used by mcp on client hosts # command-line options that will be used by mcp on client hosts
# (if mcp >= 1.822.1, a --preallocate setting is recommended for DMF on CXFS) # (if mcp >= 1.822.1, a --preallocate setting is recommended for DMF on CXFS)
#opts_mcp --double-buffer #opts_mcp --double-buffer


@ -66,7 +66,7 @@ use Symbol qw(gensym);
use Sys::Hostname; use Sys::Hostname;
use Text::ParseWords; use Text::ParseWords;
our $VERSION = 7.05; our $VERSION = 8.0;
# do not die when receiving sigpipe # do not die when receiving sigpipe
$SIG{PIPE} = 'IGNORE'; $SIG{PIPE} = 'IGNORE';
@ -1260,7 +1260,7 @@ sub mount {
$mnt{opts} = /[\(,]ro[\),]/ ? "ro" : "rw"; $mnt{opts} = /[\(,]ro[\),]/ ? "ro" : "rw";
# acl support is the default unless explicitly disabled # acl support is the default unless explicitly disabled
$mnt{opts} .= ",acl" if (/[\(,]acl[\),]/ || $acl && !/[\(,]noacl[\),]/); $mnt{opts} .= ",acl" if (/[\(,]acl[\),]/ || $acl && !/[\(,]noacl[\),]/);
$mnt{opts} .= ",dmf" if (/[\(,]dm(ap)?i[\),]/); $mnt{opts} .= ",dmf" if (/[\(,](dmapi|dmi|xdsm)[\),]/);
$mnt{opts} .= ",xattr" if (/[\(,]user_xattr[\),]/); $mnt{opts} .= ",xattr" if (/[\(,]user_xattr[\),]/);
#TODO: need to escape local and remote? #TODO: need to escape local and remote?
(my $dev, $mnt{local}, my $type) = ($1, $2, $3) (my $dev, $mnt{local}, my $type) = ($1, $2, $3)
@ -1291,17 +1291,19 @@ sub mount {
     $mnt{servers} =~ s/@\w*//g;
     $mnt{servers} = join("|", map {$_ = fqdn($_)} split(/:/, $mnt{servers}));
 } elsif ($type eq 'gpfs') {
-    # gpfs servers do not appear in mount output so call mmlsmgr
-    my $srv = open3_get([-1, undef, -1], "mmlsmgr $dev");
-    # try a default location if not in path
-    $srv = open3_get([-1, undef, -1],
-        "/usr/lpp/mmfs/bin/mmlsmgr $dev") if (!$srv);
-    next if (!defined $srv);
-    # output is file system then server ip address
-    if ($srv =~ /^(\w+)\s+(\d+\.\d+\.\d+\.\d+)/m) {
-        $mnt{remote} = "/$1";
-        $mnt{servers} = fqdn($2);
+    # gpfs servers do not appear in mount output so read config
+    if (open(FILE, "/var/mmfs/gen/mmfs.cfg")) {
+        while (<FILE>) {
+            s/^\s+|\s+$//g;
+            if (/^clustername\s+(\S+)/i) {
+                $mnt{servers} = $1;
+                $mnt{remote} = "/" . $mnt{servers};
+                last;
+            }
+        }
+        close FILE;
     }
+    next if (!$mnt{servers});
 } elsif ($mnt{opts} =~ /,dmf/) {
     # always report dmf file systems even if local
     $mnt{servers} = $mnt{host};

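The GPFS change above replaces the `mmlsmgr` call with a scan of `/var/mmfs/gen/mmfs.cfg` for a `clustername` entry: each line is trimmed, matched case-insensitively against `clustername` followed by whitespace, and the next non-whitespace token becomes the server name (the remote path is then `/` plus that name). The sketch below mirrors only the per-line matching step, written in C for illustration. The function name `parse_clustername` is hypothetical, and the sample lines in the usage note are made up rather than taken from a real configuration file.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <strings.h>  /* strncasecmp (POSIX) */

/* Hypothetical helper: if `line` is a "clustername VALUE" entry
 * (case-insensitive key, optional leading whitespace), copy VALUE into
 * `out` and return 1; otherwise return 0. VALUE is truncated if it does
 * not fit in outlen - 1 characters. */
int parse_clustername(const char *line, char *out, size_t outlen) {
    static const char key[] = "clustername";
    const size_t keylen = sizeof(key) - 1;
    size_t n = 0;
    size_t i;

    assert(line != NULL && out != NULL && outlen > 0);

    /* skip leading whitespace, like s/^\s+// in the Perl code */
    while (isspace((unsigned char)*line)) line++;

    /* the key must match and be followed by at least one whitespace char */
    if (strncasecmp(line, key, keylen) != 0) return 0;
    line += keylen;
    if (!isspace((unsigned char)*line)) return 0;
    while (isspace((unsigned char)*line)) line++;

    /* capture the run of non-whitespace characters, like (\S+) */
    while (line[n] != '\0' && !isspace((unsigned char)line[n])) n++;
    if (n == 0) return 0;
    if (n >= outlen) n = outlen - 1;
    for (i = 0; i < n; i++) out[i] = line[i];
    out[n] = '\0';
    return 1;
}
```

Called on a made-up line such as `"  clusterName   gpfs1.example.com  \n"`, the helper returns 1 and stores `gpfs1.example.com`; a line such as `"clusternamefoo bar"` is rejected because the key is not followed by whitespace.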
File diff suppressed because it is too large


@ -83,7 +83,7 @@ use constant SFTP_TRUNC => 0x10;
use constant SFTP_WRITE => 0x02; use constant SFTP_WRITE => 0x02;
use constant SFTP_EXCL => 0x20; use constant SFTP_EXCL => 0x20;
our $VERSION = 7.08; our $VERSION = 8.0;
$Data::Dumper::Indent = 0; $Data::Dumper::Indent = 0;
$Data::Dumper::Purity = 1; $Data::Dumper::Purity = 1;
@ -233,7 +233,7 @@ if ($opts{h}) {
if (scalar(@ARGV) > 0) { if (scalar(@ARGV) > 0) {
# ignore command if key generation forced and no arguments given # ignore command if key generation forced and no arguments given
my $cmd = shift @ARGV; my $cmd = shift @ARGV;
if ($cmd !~ /(?:^|\W)(?:bbcp|bbftp|bbscp|globus-url-copy|mesh-keykill|mesh-keytime|pcp\+|rm|rsync|scp|sftp|shiftc?|ssh|ssh-balance)$/ && $< != 0) { if ($cmd !~ /(?:^|\W)(?:bbcp|bbftp|bbscp|mesh-keykill|mesh-keytime|pcp\+|rm|rsync|scp|sftp|shiftc?|ssh|ssh-balance)$/ && $< != 0) {
# resolve all symlinks to support links to host:/path in VFS # resolve all symlinks to support links to host:/path in VFS
# (exclude rm so linked targets are not removed) # (exclude rm so linked targets are not removed)
# (exclude root to prevent unintended exposure/modification) # (exclude root to prevent unintended exposure/modification)
@ -245,7 +245,7 @@ if (scalar(@ARGV) > 0) {
if ($cmd =~ /(?:^|\W)pwd$/) { if ($cmd =~ /(?:^|\W)pwd$/) {
print "$ENV{PWD}\n"; print "$ENV{PWD}\n";
exit; exit;
} elsif ($cmd !~ /(?:^|\W)(?:bbcp|bbftp|bbscp|globus-url-copy|mesh-keykill|mesh-keytime|pcp\+|rsync|scp|sftp|ssh|ssh-balance)$/ && } elsif ($cmd !~ /(?:^|\W)(?:bbcp|bbftp|bbscp|mesh-keykill|mesh-keytime|pcp\+|rsync|scp|sftp|ssh|ssh-balance)$/ &&
!$argv_hostpath && (!hostpath($ENV{PWD}) || !$argv_hostpath && (!hostpath($ENV{PWD}) ||
grep(!/^[-\/]/, @ARGV) == 0 || $cmd =~ /(?:^|\W)complete$/ && grep(!/^[-\/]/, @ARGV) == 0 || $cmd =~ /(?:^|\W)complete$/ &&
$ARGV[1] =~ /^\//)) { $ARGV[1] =~ /^\//)) {
@ -532,16 +532,6 @@ if ($ARGV[0] =~ /(?:^|\W)(?:scp|sftp)$/) {
splice(@ARGV, 1, 0, ("-S", "$opts{ssh} %H bbcp", "-T", "$opts{ssh} %H bbcp")); splice(@ARGV, 1, 0, ("-S", "$opts{ssh} %H bbcp", "-T", "$opts{ssh} %H bbcp"));
} elsif ($ARGV[0] =~ /(?:^|\W)(?:bbftp|bbscp)$/) { } elsif ($ARGV[0] =~ /(?:^|\W)(?:bbftp|bbscp)$/) {
splice(@ARGV, 1, 0, ("-L", $opts{ssh})); splice(@ARGV, 1, 0, ("-L", $opts{ssh}));
} elsif ($ARGV[0] =~ /(?:^|\W)globus-url-copy$/) {
my $dir = glob("~/.globus");
mkdir $dir if (! -d $dir);
my $file = "$dir/gridftp-ssh";
open(FILE, '>', $file);
print FILE "#!/bin/sh\n$opts{ssh} \$2 sshftp";
close FILE;
chmod(0700, $file);
# reduce $argc since no additional args are spliced onto @ARGV
$argc--;
} elsif ($ARGV[0] =~ /(?:^|\W)pcp\+$/) { } elsif ($ARGV[0] =~ /(?:^|\W)pcp\+$/) {
splice(@ARGV, 1, 0, ("-s", "$opts{ssh_l} $opts{p}")); splice(@ARGV, 1, 0, ("-s", "$opts{ssh_l} $opts{p}"));
} elsif ($ARGV[0] =~ /(?:^|\W)rsync$/) { } elsif ($ARGV[0] =~ /(?:^|\W)rsync$/) {
@ -1372,10 +1362,11 @@ sub shift_ {
print " --mgr-user=USER access manager host as USER\n"; print " --mgr-user=USER access manager host as USER\n";
print " --monitor[=FORMAT] monitor progress of running transfers\n"; print " --monitor[=FORMAT] monitor progress of running transfers\n";
print " (FORMAT one of {color,csv,pad})\n"; print " (FORMAT one of {color,csv,pad})\n";
print " --plot[=[BY:]LIST] plot detailed performance when piped to gnuplot\n"; print " --plot[=[BY[:/]]LIST] plot detailed performance when piped to gnuplot\n";
print " (BY one of {client,host,id,user})\n"; print " (BY one of {client,fs,host,id,net,user})\n";
print " (LIST subset of {chattr,cksum,cp,find,io,ln,meta,\n"; print " (LIST subset of {bbftp,chattr,cksum,cp,find,fish,\n";
print " mkdir,sum})\n"; print " fish-tcp,io,ln,mcp,meta,mkdir,msum,\n";
print " rsync,shift-cp,shift-sum,sum,tool})\n";
print " --restart[=ignore] restart transfer with given --id [ignoring errors]\n"; print " --restart[=ignore] restart transfer with given --id [ignoring errors]\n";
print " --search=REGEX show only status/history matching REGEX\n"; print " --search=REGEX show only status/history matching REGEX\n";
print " --state=STATE show status of only those operations in STATE\n"; print " --state=STATE show status of only those operations in STATE\n";
@ -1395,12 +1386,10 @@ sub shift_ {
print " (use suffix {k,m,b/g,t} for 1E{3,6,9,12}) [1k]\n"; print " (use suffix {k,m,b/g,t} for 1E{3,6,9,12}) [1k]\n";
print " --interval=NUM adjust batches to run for around NUM seconds [30]\n"; print " --interval=NUM adjust batches to run for around NUM seconds [30]\n";
print " --local=LIST set local transport mechanism to one of LIST\n"; print " --local=LIST set local transport mechanism to one of LIST\n";
print " (LIST subset of {bbcp,bbftp,fish,fish-tcp,gridftp,\n"; print " (LIST subset of {bbftp,fish,fish-tcp,mcp,rsync,shift})\n";
print " mcp,rsync,shift})\n";
print " --preallocate=NUM preallocate files when sparsity under NUM percent\n"; print " --preallocate=NUM preallocate files when sparsity under NUM percent\n";
print " --remote=LIST set remote transport mechanism to one of LIST\n"; print " --remote=LIST set remote transport mechanism to one of LIST\n";
print " (LIST subset of {bbcp,bbftp,fish,fish-tcp,gridftp,\n"; print " (LIST subset of {bbftp,fish,fish-tcp,rsync,shift})\n";
print " rsync,shift})\n";
print " --retry=NUM retry failed operations up to NUM times [2]\n"; print " --retry=NUM retry failed operations up to NUM times [2]\n";
print " --size=SIZE process transfer in batches of at least SIZE bytes\n"; print " --size=SIZE process transfer in batches of at least SIZE bytes\n";
print " (use suffix {k,m,g,t} for {KB,MB,GB,TB}) [4g]\n"; print " (use suffix {k,m,g,t} for {KB,MB,GB,TB}) [4g]\n";
@ -1569,10 +1558,22 @@ sub shift_ {
} }
waitpid($pid, 0); waitpid($pid, 0);
=for mesh =for mesh
# use agent key directly since agent will have been killed by child # use new agent since initial agent will have been killed by child
$opts{sshmp} =~ s/^(ssh)/$1 -i $agent_key/; $agent_sock = open3_get([-1, undef, -1], "ssh-agent -c");
if ($agent_sock =~ /SSH_AGENT_PID\s+(\d+);/) {
$opts{k} = $1;
}
$agent_sock = $1 if ($agent_sock =~ /SSH_AUTH_SOCK\s+([^;]+);/);
$ENV{SSH_AUTH_SOCK} = $agent_sock;
open3_get([-1, undef], "ssh-add $agent_key");
=cut mesh =cut mesh
my $out = shift_mgr("--status --state=none --id=$opts{id}"); my $out = shift_mgr("--status --state=none --id=$opts{id}");
=for mesh
if ($opts{k}) {
# kill new agent
kill(SIGTERM, $opts{k}) && waitpid($opts{k}, 0);
}
=cut mesh
if (defined $opts{monitor}) { if (defined $opts{monitor}) {
print "\e[1A\e[K" foreach (1 .. 5); print "\e[1A\e[K" foreach (1 .. 5);
} }
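The agent restart in this hunk depends on pulling the pid and socket path out of the csh-style text printed by `ssh-agent -c`. A minimal sketch of that parsing step follows, reusing the two patterns from the hunk. The helper name `parse_agent` and the sample output (socket path and pid values) are made up for illustration and are not part of shift.

```perl
use strict;
use warnings;

# Made-up sample of the csh-style output printed by `ssh-agent -c`
my $agent_out = join("\n",
    "setenv SSH_AUTH_SOCK /tmp/ssh-XXXXabcd/agent.4321;",
    "setenv SSH_AGENT_PID 4322;",
    "echo Agent pid 4322;", "");

# parse_agent is a hypothetical helper, not a shift function
sub parse_agent {
    my ($text) = @_;
    my %agent;
    # same patterns as used in the hunk above
    $agent{pid} = $1 if ($text =~ /SSH_AGENT_PID\s+(\d+);/);
    $agent{sock} = $1 if ($text =~ /SSH_AUTH_SOCK\s+([^;]+);/);
    return \%agent;
}

my $agent = parse_agent($agent_out);
print "pid=$agent->{pid} sock=$agent->{sock}\n";
```

In the commit itself the pid is kept in `$opts{k}` so the temporary agent can be terminated after the status call, and the socket path is exported through `$ENV{SSH_AUTH_SOCK}` before `ssh-add` runs.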
@@ -2201,7 +2202,6 @@ sub shift_loop {
next if ($t =~ /^(shift|fish(-tcp)?)$/); next if ($t =~ /^(shift|fish(-tcp)?)$/);
foreach my $path (split(/:/, $ENV{PATH})) { foreach my $path (split(/:/, $ENV{PATH})) {
if (-x "$path/$t" || if (-x "$path/$t" ||
$t eq 'gridftp' && -x "$path/globus-url-copy" ||
$t =~ /^fish(-tcp)$/ && -x "$path/$opts{caux}") { $t =~ /^fish(-tcp)$/ && -x "$path/$opts{caux}") {
$have{$t} = 1; $have{$t} = 1;
last; last;
@@ -2230,7 +2230,7 @@ sub shift_loop {
$rtthost->{$1} = 1; $rtthost->{$1} = 1;
} elsif ($args[0] =~ /^(?:exclude|include)$/) { } elsif ($args[0] =~ /^(?:exclude|include)$/) {
$opts{$args[0]} = thaw(unescape($op{text})); $opts{$args[0]} = thaw(unescape($op{text}));
} elsif ($args[0] =~ /^(?:bandwidth|buffer|create-tar|cron|dereference|extract-tar|find-files|force|get_host|ignore-times|index-tar|newer|offline|older|opts_bbcp|opts_bbftp|opts_gridftp|opts_mcp|opts_msum|opts_ssh|ports|preallocate|preserve|recall|sanity|secure|streams|stripe|stripe-pool|stripe-size|sum_split|sum_type|sync|sync_host|threads|window|verify|verify-fast)$/) { } elsif ($args[0] =~ /^(?:bandwidth|buffer|create-tar|cron|dereference|extract-tar|find-files|force|get_host|ignore-times|index-tar|newer|offline|older|opts_bbftp|opts_mcp|opts_msum|opts_ssh|ports|preallocate|preserve|recall|sanity|secure|streams|stripe|stripe-pool|stripe-size|sum_split|sum_type|sync|sync_host|threads|window|verify|verify-fast)$/) {
$opts{$args[0]} = defined $op{text} ? $opts{$args[0]} = defined $op{text} ?
unescape($op{text}) : 1; unescape($op{text}) : 1;
} }
@@ -2549,7 +2549,7 @@ sub shift_mounts {
$mnt{opts} = /[\(,]ro[\),]/ ? "ro" : "rw"; $mnt{opts} = /[\(,]ro[\),]/ ? "ro" : "rw";
# acl support is the default unless explicitly disabled # acl support is the default unless explicitly disabled
$mnt{opts} .= ",acl" if (/[\(,]acl[\),]/ || $acl && !/[\(,]noacl[\),]/); $mnt{opts} .= ",acl" if (/[\(,]acl[\),]/ || $acl && !/[\(,]noacl[\),]/);
$mnt{opts} .= ",dmf" if (/[\(,]dm(ap)?i[\),]/); $mnt{opts} .= ",dmf" if (/[\(,](dmapi|dmi|xdsm)[\),]/);
$mnt{opts} .= ",xattr" if (/[\(,]user_xattr[\),]/); $mnt{opts} .= ",xattr" if (/[\(,]user_xattr[\),]/);
#TODO: need to escape local and remote? #TODO: need to escape local and remote?
(my $dev, $mnt{local}, my $type) = ($1, $2, $3) (my $dev, $mnt{local}, my $type) = ($1, $2, $3)
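The option detection changed in this hunk can be tried in isolation. The sketch below wraps the four patterns in a hypothetical helper `mount_opts`, with the argument `$acl_default` standing in for the `$acl` variable of the surrounding code. The mount lines are invented examples, not output from a real system.

```perl
use strict;
use warnings;

# mount_opts is a hypothetical helper; $acl_default stands in for $acl
sub mount_opts {
    my ($line, $acl_default) = @_;
    my $opts = $line =~ /[\(,]ro[\),]/ ? "ro" : "rw";
    # acl support is the default unless explicitly disabled
    $opts .= ",acl" if ($line =~ /[\(,]acl[\),]/ ||
        $acl_default && $line !~ /[\(,]noacl[\),]/);
    # new pattern from this commit: dmapi, dmi, or xdsm all indicate dmf
    $opts .= ",dmf" if ($line =~ /[\(,](dmapi|dmi|xdsm)[\),]/);
    $opts .= ",xattr" if ($line =~ /[\(,]user_xattr[\),]/);
    return $opts;
}

# invented mount lines for illustration only
print mount_opts("/dev/sda1 on /arch type xfs (rw,xdsm)", 0), "\n";
print mount_opts("/dev/sdb1 on /data type ext4 (ro,noacl,user_xattr)", 1), "\n";
```

Each pattern requires the option to be delimited by a parenthesis or comma on both sides, which is why `noacl` does not also count as `acl`.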
@@ -2590,18 +2590,32 @@ sub shift_mounts {
# lustre may have extra @id and multiple colon-separated servers # lustre may have extra @id and multiple colon-separated servers
$mnt{servers} =~ s/@\w*//g; $mnt{servers} =~ s/@\w*//g;
$mnt{servers} = join("|", map {$_ = fqdn($_)} split(/:/, $mnt{servers})); $mnt{servers} = join("|", map {$_ = fqdn($_)} split(/:/, $mnt{servers}));
} elsif ($type eq 'gpfs') { } elsif ($type eq 'beegfs') {
# gpfs servers do not appear in mount output so call mmlsmgr # beegfs servers do not appear in mount output so call beegfs-ctl
my $srv = open3_get([-1, undef, -1], "mmlsmgr $dev"); my $srv = open3_get([-1, undef, -1],
# try a default location if not in path "beegfs-ctl --listnodes --nodetype=management --mount=$mnt{local}");
$srv = open3_get([-1, undef, -1],
"/usr/lpp/mmfs/bin/mmlsmgr $dev") if (!$srv);
next if (!defined $srv); next if (!defined $srv);
# output is file system then server ip address chomp $srv;
if ($srv =~ /^(\w+)\s+(\d+\.\d+\.\d+\.\d+)/m) { # output is host name then id
$mnt{remote} = "/$1"; my @hosts;
$mnt{servers} = fqdn($2); push(@hosts, fqdn($1)) while ($srv =~ /^([\w.-]+)(\s|$)/mg);
next if (!scalar(@hosts));
$mnt{servers} = join("|", @hosts);
$mnt{remote} = "/" . $mnt{servers};
} elsif ($type eq 'gpfs') {
# gpfs servers do not appear in mount output so read config
if (open(FILE, "/var/mmfs/gen/mmfs.cfg")) {
while (<FILE>) {
s/^\s+|\s+$//g;
if (/^clustername\s+(\S+)/i) {
$mnt{servers} = $1;
$mnt{remote} = "/" . $mnt{servers};
last;
}
}
close FILE;
} }
next if (!$mnt{servers});
} elsif ($mnt{opts} =~ /,dmf/) { } elsif ($mnt{opts} =~ /,dmf/) {
# always report dmf file systems even if local # always report dmf file systems even if local
$mnt{servers} = $mnt{host}; $mnt{servers} = $mnt{host};
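The new GPFS branch reads a cluster name from a config file instead of calling `mmlsmgr`. A minimal sketch of that loop is given below, run against an in-memory file so it needs no GPFS installation. The helper name `gpfs_cluster` and the sample config text are assumptions made for illustration; only the trimming step and the `clustername` pattern come from the hunk.

```perl
use strict;
use warnings;

# gpfs_cluster is a hypothetical helper that scans an open file handle
sub gpfs_cluster {
    my ($fh) = @_;
    while (my $line = <$fh>) {
        # trim leading/trailing whitespace as in the hunk
        $line =~ s/^\s+|\s+$//g;
        # first clustername entry wins (case-insensitive)
        return $1 if ($line =~ /^clustername\s+(\S+)/i);
    }
    return;
}

# invented config text standing in for /var/mmfs/gen/mmfs.cfg
my $cfg = "# comment\n  clusterName gpfs1.example.com\nclusterId 12345\n";
open(my $fh, '<', \$cfg) or die "cannot open in-memory file: $!";
my $cluster = gpfs_cluster($fh);
close $fh;
print "servers=$cluster remote=/$cluster\n" if (defined $cluster);
```

As in the commit, the same value serves as both the server identifier and, prefixed with a slash, the remote path, and a mount with no match is skipped.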
@@ -3254,16 +3268,10 @@ sub transport {
my $tool = $opts{$type}->[ my $tool = $opts{$type}->[
($i + $ref->{try}) % scalar(@{$opts{$type}})]; ($i + $ref->{try}) % scalar(@{$opts{$type}})];
next if ( next if (
# bbcp does not encrypt and cannot handle partial transfers or
# (using --infiles) colon/ff/cr/lf/tab/vt in file names
$tool eq 'bbcp' && ($opts{secure} || $ref->{bytes} ||
"$src$dst" =~ /[:\f\n\r\t\x0b]/) ||
# bbftp does not encrypt and cannot handle partial transfers or # bbftp does not encrypt and cannot handle partial transfers or
# whitespace/vt in file names # whitespace/vt in file names
$tool eq 'bbftp' && ($opts{secure} || $ref->{bytes} || $tool eq 'bbftp' && ($opts{secure} || $ref->{bytes} ||
"$src$dst" =~ /[\s\x0b]/) || "$src$dst" =~ /[\s\x0b]/) ||
# gridftp cannot handle tar ops due to differing src/dst offsets
$tool eq 'gridftp' && defined $ref->{tar_bytes} ||
# rsync cannot handle partial transfers, and # rsync cannot handle partial transfers, and
# (using --files-from) cannot handle cr/lf in file names # (using --files-from) cannot handle cr/lf in file names
$tool eq 'rsync' && ($ref->{bytes} || "$src$dst" =~ /[\n\r]/)); $tool eq 'rsync' && ($ref->{bytes} || "$src$dst" =~ /[\n\r]/));
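With bbcp and gridftp gone, the selection logic in this hunk reduces to rotating through the configured tools by retry count and skipping any tool that cannot handle the operation. The following sketch shows that idea under stated assumptions: the helpers `unusable` and `pick_tool` are hypothetical, and the loop over `$i` is inferred from the index expression, since the loop header is not visible in the hunk.

```perl
use strict;
use warnings;

# unusable is a hypothetical helper holding the remaining skip rules
sub unusable {
    my ($tool, $ref, $secure, $src, $dst) = @_;
    # bbftp does not encrypt and cannot handle partial transfers or
    # whitespace/vt in file names
    return 1 if ($tool eq 'bbftp' && ($secure || $ref->{bytes} ||
        "$src$dst" =~ /[\s\x0b]/));
    # rsync cannot handle partial transfers or cr/lf in file names
    return 1 if ($tool eq 'rsync' && ($ref->{bytes} ||
        "$src$dst" =~ /[\n\r]/));
    return 0;
}

# pick_tool is hypothetical: rotate the start position by the retry count
sub pick_tool {
    my ($tools, $ref, $secure, $src, $dst) = @_;
    my $n = scalar(@{$tools});
    foreach my $i (0 .. $n - 1) {
        my $tool = $tools->[($i + ($ref->{try} || 0)) % $n];
        next if (unusable($tool, $ref, $secure, $src, $dst));
        return $tool;
    }
    return;
}

my @tools = qw(bbftp rsync shift);
print pick_tool(\@tools, {try => 0}, 0, "/a/x", "/b/x"), "\n";
print pick_tool(\@tools, {try => 1}, 0, "/a/x", "/b/x"), "\n";
print pick_tool(\@tools, {try => 0, bytes => "0-1024"}, 0, "/a/x", "/b/x"), "\n";
```

Rotating by the retry count means a failed operation starts with a different transport on its next attempt rather than repeating the same one.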
@@ -3298,16 +3306,12 @@ sub transport {
foreach my $tool (keys %tools) { foreach my $tool (keys %tools) {
next if (!scalar(@{$tools{$tool}})); next if (!scalar(@{$tools{$tool}}));
if ($tool eq 'bbcp') { if ($tool eq 'bbftp') {
transport_bbcp($host, $tools{$tool});
} elsif ($tool eq 'bbftp') {
transport_bbftp($host, $tools{$tool}); transport_bbftp($host, $tools{$tool});
} elsif ($tool eq 'fish') { } elsif ($tool eq 'fish') {
transport_fish($host, $tools{$tool}); transport_fish($host, $tools{$tool});
} elsif ($tool eq 'fish-tcp') { } elsif ($tool eq 'fish-tcp') {
transport_fish($host, $tools{$tool}, 1); transport_fish($host, $tools{$tool}, 1);
} elsif ($tool eq 'gridftp') {
transport_gridftp($host, $tools{$tool});
} elsif ($tool eq 'mcp') { } elsif ($tool eq 'mcp') {
transport_mcp($host, $tools{$tool}); transport_mcp($host, $tools{$tool});
} elsif ($tool eq 'rsync') { } elsif ($tool eq 'rsync') {
@@ -3330,139 +3334,6 @@ sub transport {
return $rsize; return $rsize;
} }
########################
#### transport_bbcp ####
########################
sub transport_bbcp {
my ($host, $tcmds) = @_;
my %errs;
my ($fh, $tmp) = sftp_tmp();
my $sep = chr(0);
my ($shost, $spath, $dhost, $dpath, $args);
if ($host eq 'localhost') {
$shost = "";
# bbcp assumes host name instead of localhost when host not given
$dhost = "localhost:";
$args = " -S bbcp -T bbcp";
} else {
my $ssh_l;
if ($host =~ /@/) {
($ssh_l, $host) = split(/@/, $host);
$ssh_l = " -l " . $ssh_l if ($ssh_l);
}
$args = " -S '$opts{ssh}$ssh_l %H bbcp' -T '$opts{ssh}$ssh_l %H bbcp'";
}
foreach my $cmd (@{$tcmds}) {
my ($op, $src, $dst, $ref) = @{$cmd};
$ref->{tool} = "bbcp";
if (!defined $shost) {
$shost = $op eq 'get' ? $host . ":" : "";
$dhost = $op eq 'put' ? $host . ":" : "";
}
# find longest common suffix starting with "/"
if ("$src$sep$dst" =~ /^.*?(\/.*)$sep.*\1$/) {
my $lcs = $1;
# bbcp batch mode does not use leading slashes like rsync
$lcs =~ s/^\/+//;
if ($spath && $src eq "$spath/$lcs" && $dst eq "$dpath/$lcs") {
print $fh ($shost ? $shost . ":" : ""), "$lcs\n";
$errs{"$spath/$lcs"}->{$ref} = $ref;
$errs{"$dpath/$lcs"}->{$ref} = $ref;
next;
} elsif ($spath) {
# next file has different prefix so process current batch
close $fh;
transport_bbcp_batch($args . ($shost ? " -z" : ""), $tmp,
"$shost$spath", "$dhost$dpath", \%errs, $host);
%errs = ();
open($fh, '>', $tmp);
}
print $fh ($shost ? $shost . ":" : ""), "$lcs\n";
$spath = $src;
# escape lcs in case it contains regex characters
$spath =~ s/\/\Q$lcs\E$//;
$dpath = $dst;
$dpath =~ s/\/\Q$lcs\E$//;
$errs{"$spath/$lcs"}->{$ref} = $ref;
$errs{"$dpath/$lcs"}->{$ref} = $ref;
} else {
# no common suffix implies single file copy with rename
# or symlink dereference
my %errs_tmp;
# use different hash as other files may already be in there
$errs_tmp{$src}->{$ref} = $ref;
$errs_tmp{$dst}->{$ref} = $ref;
transport_bbcp_batch($args . ($shost ? " -z" : ""), "",
"$shost$src", "$dhost$dst", \%errs_tmp, $host);
}
}
close $fh;
if ($spath) {
transport_bbcp_batch($args . ($shost ? " -z" : ""), $tmp,
"$shost$spath", "$dhost$dpath", \%errs, $host);
}
unlink $tmp;
}
##############################
#### transport_bbcp_batch ####
##############################
sub transport_bbcp_batch {
my ($args, $from, $src, $dst, $errs, $host) = @_;
my ($pid, $in, $out, $size);
$from = " --infiles $from -d" if ($from);
eval {
local $SIG{__WARN__} = sub {die};
# escape remote src/dst metacharacters since interpreted by remote shell
my ($esrc, $edst) = ($src, $dst);
$esrc =~ s/([^A-Za-z0-9\-_.:+\/])/\\$1/g if ($esrc =~ /^[^\/]/);
$edst =~ s/([^A-Za-z0-9\-_.:+\/])/\\$1/g if ($edst =~ /^[^\/]/);
my $nstream = $host eq 'localhost' ? $opts{threads} : $opts{streams};
my $extra;
$extra .= " -B " . $opts{buffer} if ($opts{buffer});
$extra .= " -s " . $nstream if ($nstream);
$extra .= " -w " . $opts{window} if ($opts{window});
$extra .= " -Z " . $opts{ports} if ($opts{ports});
# apply opts_bbcp last to override other settings
$extra .= " " . $opts{opts_bbcp};
# use open3 to avoid executing a shell command based on the name
# of a file being copied (which may contain metacharacters, etc.)
# must keep write access to handle warnings/corruption
$pid = IPC::Open3::open3($in, $out, $out,
# make sure quotewords string does not end in space
quotewords('\s+', 0, "bbcp $extra -AfKv -m 0600$args$from"),
$esrc, $edst);
};
if (!$@) {
while (my $line = <$out>) {
$line =~ s/\s+$//;
if ($line =~ /^File (.*) created(?!.*created)/) {
my $file = $1;
foreach my $key (grep(/^\Q$file\E$/, keys(%{$errs}))) {
$_->{text} = 0 foreach (values %{$errs->{$key}});
}
} elsif ($line =~ /^bbcp: [^\/]*(\/.*)/) {
my $file = $1;
foreach my $key (grep(/^\Q$file\E$/, keys(%{$errs}))) {
sftp_error($_, $line) foreach (values %{$errs->{$key}});
}
}
}
}
close $in;
close $out;
waitpid($pid, 0) if ($pid);
foreach my $key (keys %{$errs}) {
foreach (values %{$errs->{$key}}) {
sftp_error($_, "unknown bbcp failure") if (!defined $_->{text});
}
}
}
######################### #########################
#### transport_bbftp #### #### transport_bbftp ####
######################### #########################
@@ -3564,7 +3435,8 @@ sub transport_chattr {
foreach my $cmd (@{$tcmds}) { foreach my $cmd (@{$tcmds}) {
my ($op, $src, $dst, $ref) = @{$cmd}; my ($op, $src, $dst, $ref) = @{$cmd};
# lfs setstripe (must be done before fallocate) # lfs setstripe (must be done before fallocate)
if ((!$opts{'create-tar'} && $op =~ /^(?:get|mkdir|put)$/ || if ((!$opts{'create-tar'} && ($op =~ /^(?:get|put)$/ ||
$op eq 'mkdir' && $ref->{lustre_attrs}) ||
$op eq 'chattr' && $ref->{tar_creat}) && $op eq 'chattr' && $ref->{tar_creat}) &&
$ref->{dstfs} && $ref->{dstfs} =~ /^lustre/ && !$ref->{ln} && $ref->{dstfs} && $ref->{dstfs} =~ /^lustre/ && !$ref->{ln} &&
($opts{stripe} ne '0' || defined $opts{'stripe-size'} || ($opts{stripe} ne '0' || defined $opts{'stripe-size'} ||
@@ -3573,23 +3445,28 @@ sub transport_chattr {
my @stripe = (0, 0); my @stripe = (0, 0);
# preserve existing striping when available # preserve existing striping when available
@stripe = split(/,/, $ref->{lustre_attrs}) if ($ref->{lustre_attrs}); @stripe = split(/,/, $ref->{lustre_attrs}) if ($ref->{lustre_attrs});
my @attrs = split(/,/, $ref->{attrs}); # preserve but do not apply striping expressions to directories
# define variables that are allowed in striping expressions if ($op ne 'mkdir') {
my ($nm, $sz, $sc, $ss) = ($dst, $attrs[7], @stripe); my @attrs = split(/,/, $ref->{attrs});
# base striping on tar size instead of file size during tar creation # define variables that are allowed in striping expressions
$sz = $ref->{tar_creat} if ($ref->{tar_creat}); my ($nm, $sz, $sc, $ss) = ($dst, $attrs[7], @stripe);
my @evals = ($opts{stripe}, $opts{'stripe-size'}); # base striping on tar size instead of file size during tar creation
push(@evals, $opts{'stripe-pool'}) $sz = $ref->{tar_creat} if ($ref->{tar_creat});
if ($opts{'stripe-pool'} !~ /^[\w.-]+$/); my @evals = ($opts{stripe}, $opts{'stripe-size'});
# evaluate all striping expressions push(@evals, $opts{'stripe-pool'})
foreach my $i (0 .. 2) { if ($opts{'stripe-pool'} !~ /^[\w.-]+$/);
my $eval = $evals[$i]; # evaluate all striping expressions
next if (!$eval); foreach my $i (0 .. 2) {
$eval =~ s/(NM|SZ|SC|SS)/q($).lc($1)/eg; my $eval = $evals[$i];
$stripe[$i] = eval $eval; next if (!$eval);
$eval =~ s/(NM|SZ|SC|SS)/q($).lc($1)/eg;
$stripe[$i] = eval $eval;
}
# count >= 64k indicates a size per stripe
$stripe[0] = ceil($sz / $stripe[0]) if ($stripe[0] >= 65536);
} }
# count >= 64k indicates a size per stripe # size should never be < 64k (llapi inexplicably returns 2 for dirs)
$stripe[0] = ceil($sz / $stripe[0]) if ($stripe[0] >= 65536); $stripe[1] = 0 if ($stripe[1] < 65536);
my @args = ("setstripe", escape($dst) . ($op eq 'mkdir' ? "/" : ""), my @args = ("setstripe", escape($dst) . ($op eq 'mkdir' ? "/" : ""),
join(" ", @stripe)); join(" ", @stripe));
push(@chattrs, \@args); push(@chattrs, \@args);
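The striping expression handling in this hunk rewrites the tokens NM, SZ, SC, and SS into Perl variables, evaluates the result, and then treats any count of 64k or more as a size per stripe. The sketch below isolates that step for the stripe count only. The helper name `stripe_count` and the sample expressions are assumptions for illustration; the substitution, the string eval, and the 65536 threshold are taken from the hunk.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# stripe_count is a hypothetical helper covering only the count expression
sub stripe_count {
    # variables allowed in striping expressions: name, size, count, stripe size
    my ($expr, $nm, $sz, $sc, $ss) = @_;
    my $eval = $expr;
    # turn NM/SZ/SC/SS into $nm/$sz/$sc/$ss as in the hunk
    $eval =~ s/(NM|SZ|SC|SS)/q($).lc($1)/eg;
    my $count = eval $eval;
    $count = 0 if (!defined $count);
    # count >= 64k indicates a size per stripe
    $count = ceil($sz / $count) if ($count >= 65536);
    return $count;
}

# conditional expression on file size
print stripe_count('SZ > 1e9 ? 8 : 1', "/lustre/f", 4e9, 0, 0), "\n";
# literal 1 GiB per stripe for a file just over 4 GiB
print stripe_count('1073741824', "/lustre/f", 4294967297, 0, 0), "\n";
```

Since the expression is passed to a string eval, it should only ever come from a trusted source. After this commit the evaluation is skipped for `mkdir` operations, so directories keep any preserved striping but do not have expressions applied to them.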
@@ -4722,116 +4599,6 @@ sub transport_fish_return {
return {error => "Invalid protocol return ($msg)"}; return {error => "Invalid protocol return ($msg)"};
} }
###########################
#### transport_gridftp ####
###########################
sub transport_gridftp {
my ($host, $tcmds) = @_;
my $ssh_l;
if ($host =~ /@/) {
($ssh_l, $host) = split(/@/, $host);
$ssh_l = " -l " . $ssh_l if ($ssh_l);
}
# make sure gridftp-ssh is set up properly
my $prefix = $host ne 'localhost' ? "sshftp://$host" : "file://";
my $dir = glob("~/.globus");
mkdir $dir if (! -d $dir);
my $file = "$dir/gridftp-ssh";
open(FILE, '>', $file);
# note that sshftp must exist in path (normally resides in .globus/sshftp)
print FILE "#!/bin/sh\n$opts{ssh}$ssh_l \$2 sshftp";
close FILE;
chmod(0700, $file);
my %errs;
my ($fh, $tmp);
foreach my $cmd (@{$tcmds}) {
my ($op, $src, $dst, $ref) = @{$cmd};
($fh, $tmp) = sftp_tmp() if (!$tmp);
if ($op eq 'put') {
$src = "file://" . escape($src);
$dst = $prefix . escape($dst);
$errs{"$src $dst"} = $ref;
} else {
$src = $prefix . escape($src);
$dst = "file://" . escape($dst);
$errs{"$src $dst"} = $ref;
}
if ($ref->{bytes}) {
my @ranges = split(/,/, $ref->{bytes});
foreach my $range (@ranges) {
my ($x1, $x2) = split(/-/, $range);
print $fh "$src $dst $x1,", $x2 - $x1, "\n";
}
} else {
print $fh "$src $dst\n";
}
$ref->{tool} = "gridftp";
}
return if (!$tmp);
close $fh;
my $nstream = $host eq 'localhost' ? $opts{threads} : $opts{streams};
my $extra;
$extra .= " -bs " . $opts{buffer} if ($opts{buffer});
$extra .= " -p " . $nstream if ($nstream);
$extra .= " -tcp-bs " . $opts{window} if ($opts{window});
# encrypt data channel during secure transfers
$extra .= " -dcpriv" if ($opts{secure});
# apply opts_gridftp last to override other settings
$extra .= " " . $opts{opts_gridftp};
if ($opts{ports}) {
#TODO: test that this really works (both on open3 side and globus side)
my $ports = $opts{ports};
$ports =~ s/:/,/;
$ENV{GLOBUS_TCP_RANGE} = $ports;
$ENV{GLOBUS_TCP_PORT_RANGE} = $ports;
$ENV{GLOBUS_TCP_SOURCE_RANGE} = $ports;
$ENV{GLOBUS_UDP_PORT_RANGE} = $ports;
$ENV{GLOBUS_UDP_SOURCE_RANGE} = $ports;
}
if (open(OUT, '-|',
# unbuffer must be used to interleave stdout/stderr
"unbuffer globus-url-copy $extra -c -cd -r -v -f $tmp 2>&1")) {
my ($src, $dst, $text);
while (my $line = <OUT>) {
$line =~ s/\s+$//;
if ($line =~ /^Source:\s*(\S+)/) {
if ($dst && $text && $errs{"$src $dst"}) {
sftp_error($errs{"$src $dst"}, $text);
} elsif ($dst && $errs{"$src $dst"}) {
$errs{"$src $dst"}->{text} = 0;
}
$text = undef;
$src = $1;
} elsif ($line =~ /^Dest:\s*(\S+)/) {
$dst = $1;
} elsif ($line =~ /^\s*(\S+)\s*->\s*(\S+)$/) {
$src .= $1;
$dst .= $2;
} elsif ($line =~ /^\s*(\S+)$/) {
$src .= $1;
$dst .= $1;
} elsif ($line && $line !~ /^error: There was an error with/) {
$text .= $line . " ";
}
}
if ($dst && $text && $errs{"$src $dst"}) {
sftp_error($errs{"$src $dst"}, $text);
} elsif ($dst && $errs{"$src $dst"}) {
$errs{"$src $dst"}->{text} = 0;
}
}
close OUT;
foreach my $key (keys %errs) {
if (!defined $errs{$key}->{text}) {
sftp_error($errs{$key}, "unknown gridftp failure");
}
}
unlink $tmp;
}
####################### #######################
#### transport_mcp #### #### transport_mcp ####
####################### #######################
@@ -5167,6 +4934,8 @@ sub transport_shift {
push(@scmds, [$x == $x1 ? undef : $x . "-" . $x2, $cmd]); push(@scmds, [$x == $x1 ? undef : $x . "-" . $x2, $cmd]);
} }
} }
# return when no work (mainly when single-threaded)
return if (!scalar(@scmds));
foreach my $o (keys %opts) { foreach my $o (keys %opts) {
# must kill existing sftp connections or various things can hang # must kill existing sftp connections or various things can hang