Discussion:
lftp-4.4.13 -- multi-core/multi-threading support for get on 10GbE networks?
Justin Piszcz
2013-11-28 20:47:54 UTC
Hello,

When transferring data on high speed networks (10GbE) lftp hits 100% on a
moderately fast Xeon CPU (E5645), the FTP server is not the bottleneck as it
uses around 37% CPU (different CPU on the server host). Are there any plans
to spin off separate workers (if possible) so a single CPU-core is not a
bottleneck at the client-side?

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
30926 user      20   0 27904 3476 2520 R 100.0  0.0  3:31.88  lftp


lftp client performance:
<--- 150 152837731.5 kbytes to download
`...-11e0-a3b1-806e6f6e6963.vhd' at 87779540224 (56%) 714.06M/s eta:99s [Receiv
`...-11e0-a3b1-806e6f6e6963.vhd' at 120031762432 (76%) 712.21M/s eta:50s [Recei
`...-11e0-a3b1-806e6f6e6963.vhd' at 128500216832 (82%) 667.87M/s eta:39s [Recei
<--- 226-File successfully transferred

<--- 226 222.418 seconds (measured here), 671.06 Mbytes per second
---- Got EOF on data connection

---- Closing data socket
156549891072 bytes transferred in 223 seconds (670.60M/s)
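
As a rough cross-check of the summary line above, the average rate can be recomputed from the byte count and the elapsed time. This is only a sketch: rate_mib is a hypothetical helper name, and it assumes that lftp's "M/s" means MiB/s (1048576 bytes), which the thread does not state.

```shell
# rate_mib BYTES SECONDS -> average throughput in MiB/s.
# Hypothetical helper; assumes 1 M = 1048576 bytes.
rate_mib() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.2f\n", bytes / secs / 1048576 }'
}

# Byte count and rounded duration taken from the lftp summary line.
rate_mib 156549891072 223
```

With the rounded 223 seconds the result lands close to the 670.60M/s that lftp printed, which is a bit over half of what a 10 Gbit/s link can carry before protocol overhead.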

Are there any plans to make lftp multi-threaded? I tried get and also pget
-n 2; the speed was the same (only 650-700 megabytes per second), and even
with pget -n 2 lftp used a single thread, which also hit 100%.

I tried a cp over NFS (mounted with TCP) and the speed is a bit erratic but
it does seem to be slightly (~200MiB/s) faster.

Device eth4 [192.168.1.2] (1/1):
================================================================================
Incoming:                Outgoing:
Curr: 0.82 MByte/s       Curr: 1137.20 MByte/s
Avg:  0.60 MByte/s       Avg:  833.62 MByte/s
Min:  0.24 MByte/s       Min:  299.93 MByte/s
Max:  0.82 MByte/s       Max:  1157.74 MByte/s
Ttl:  21.46 GByte        Ttl:  15691.89 GByte

A timed copy over NFS was about 200MiB/s faster than lftp, which suggests
that, if possible, adding multiple-core support for a single lftp get could
improve performance. I was just curious whether this is feasible with the
FTP protocol for a single get. I presume you could do something with pget;
however, you may encounter disk thrashing at those high I/O rates unless you
have extremely fast RAID-based backends.

Timed copy test: (176.41 seconds)
0.04user 83.09system 2:56.41elapsed 47%CPU (0avgtext+0avgdata 1688maxresident)k

Justin.
Alexander V. Lukyanov
2013-11-29 07:23:04 UTC
Post by Justin Piszcz
When transferring data on high speed networks (10GbE) lftp hits 100% on a
moderately fast Xeon CPU (E5645), the FTP server is not the bottleneck as it
uses around 37% CPU (different CPU on the server host). Are there any plans
to spin off separate workers (if possible) so a single CPU-core is not a
bottleneck at the client-side?
I don't think multithreading is going to be implemented in lftp. I avoided
it from the start as single-threading makes programming and debugging
easier.

But I think it is possible to squeeze more performance by optimization.
First provide me with profiling information (compile with -pg gcc option,
then run lftp, then run gprof, send me the output), then be ready to try
optimized versions to see if they make a difference.
--
Alexander.
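
The profiling steps requested above might look roughly like the following sketch. It assumes an unpacked lftp source tree that builds with its configure script and gcc. The profile_lftp name, the --disable-shared flag, the src/lftp binary path, the URL argument and the profile.txt file name are assumptions made for illustration and do not come from this thread.

```shell
# Sketch of a gprof run. Source this file, then call profile_lftp with a
# URL from the top of the lftp source tree. Nothing runs until it is called.
profile_lftp() {
    url="$1"    # hypothetical, e.g. an ftp:// URL of a large test file

    # -pg makes gcc emit the instrumentation that gprof needs.
    # --disable-shared is an assumption: gprof does not account for time
    # spent in shared libraries, so a static build gives a fuller picture.
    ./configure --disable-shared \
        CFLAGS="-pg -O2" CXXFLAGS="-pg -O2" LDFLAGS="-pg" &&
    make || return 1

    # Run one transfer with the instrumented binary; gmon.out is written
    # to the current directory when lftp exits normally.
    ./src/lftp -c "get $url" || return 1

    # Turn the raw profile data into a readable report to send along.
    gprof ./src/lftp gmon.out > profile.txt
}
```

If the build leaves a wrapper script rather than a real binary at src/lftp, gprof needs to be pointed at the actual executable instead.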
Justin Piszcz
2013-11-29 11:58:25 UTC
-----Original Message-----
Sent: Friday, November 29, 2013 2:23 AM
To: Justin Piszcz
Subject: Re: [lftp] lftp-4.4.13 -- multi-core/multi-threading support for get on 10GbE networks?
Post by Justin Piszcz
When transferring data on high speed networks (10GbE) lftp hits 100% on a
moderately fast Xeon CPU (E5645), the FTP server is not the bottleneck as it
uses around 37% CPU (different CPU on the server host). Are there any plans
to spin off separate workers (if possible) so a single CPU-core is not a
bottleneck at the client-side?
I don't think multithreading is going to be implemented in lftp. I avoided
it from the start as single-threading makes programming and debugging
easier.
But I think it is possible to squeeze more performance by optimization.
First provide me with profiling information (compile with -pg gcc option,
then run lftp, then run gprof, send me the output), then be ready to try
optimized versions to see if they make a difference.
--
Alexander.
Hello,

I forgot I had -debug enabled from my earlier testing when we were tracking
down that cls bug. With debug disabled, lftp is nearly as fast as NFS, so I
think performance is good for now. I can still run through further
tuning/gprof if needed, but I'm happy with the speeds now.

Device eth4 [192.168.1.2] (1/1):
================================================================================
Incoming:                Outgoing:
Curr: 0.81 MByte/s       Curr: 841.87 MByte/s
Avg:  0.76 MByte/s       Avg:  800.37 MByte/s
Min:  0.58 MByte/s       Min:  602.02 MByte/s
Max:  0.81 MByte/s       Max:  841.87 MByte/s
Ttl:  1.32 GByte         Ttl:  203.12 GByte

Justin.
