On 8/08/2014 3:29 p.m., Alex Rousskov wrote:
> Hello,
>
> Native FTP Relay support is finally ready: Squid receives native FTP
> commands on ftp_port and relays FTP messages between FTP clients and FTP
> origin servers through the existing Squid access control, adaptation,
> and logging layers. A compressed 70KB patch attached to this email is
> also temporary available at [1]. I would like to merge the corresponding
> lp branch[2] into Squid trunk.
>
> This is a large, complex, experimental feature. I am sure there are
> cases where it does not work well or does not support some existing
> features. I am not aware of any regressions specific to old FTP gateway
> code, and I hope there are no regressions specific to HTTP code, but I
> do not recommend deployment without testing.
>
> The branch code worked quite well in limited production deployment, but
> the final version (with code restructuring and polishing for the
> official submission) has only seen simple lab testing. AFAIK, the FTP
> interception path has not seen any production deployment at all.
>
> This code represents more than a year worth of development, started by
> Dmitry Kurochkin in the beginning of 2013. Christos Tsantilas and I
> finished Dmitry's work. I bear responsibility for the bugs, but there
> are probably areas of Dmitry's code that appear to work well and that
> neither Christos nor I have studied.
>
> Overall, we tried to focus on the new FTP code and leave the old
> FTP/HTTP code alone, except where restructuring was necessary to avoid
> code duplication. For example, the server-facing side reuses a lot of
> the old FTP code, while the relayed FTP commands are passed around in
> HttpMsg structures using the old ClientStreams, doCallout(), and Store
> interfaces.
>
> If you review the patch, please note that bzr is incapable of tracking
> code across file splits (e.g., old ftp.cc becoming ftp/Parsing.cc plus
> three clients/Ftp*.cc files). Many of the old XXXs and TODOs will appear
> as newly added code in the patch. Usually, one can figure it out by
> searching for the similar but deleted code in the patch.
>
> Please see the above referenced lp branch for a detailed change log.
>
>
> Some Native FTP Relay highlights:
>
> * Added ftp_port directive telling Squid to relay native FTP commands.
> * Active and passive FTP support on the user-facing side; require
> passive connections to come from the control connection src IP.
> * IPv6 support (EPSV and, on the user-facing side, EPRT).
> * Intelligent adaptation of relayed FTP FEAT responses.
> * Relaying of multi-line FTP control responses using various formats.
> * Support relaying of FTP MLSD and MLST commands (RFC 3659).
> * Several Microsoft FTP server compatibility features.
> * ICAP/eCAP support (at individual FTP command/response level).
> * Optional "current FTP directory" tracking (cannot be 100% reliable due
> to symbolic links and such, but is helpful in some common use cases).
> * FTP origin control connection is pinned to the FTP user connection.
> * No caching support -- no reliable Request URIs for that (see above).
> * Significant FTP code restructuring on the server-facing side.
> * Initial steps towards HTTP code restructuring on the client-facing
> side.
>
>
> Changes to the general code used by the Native FTP Relay code:
>
>
>> * The user- and origin-facing code restructured as agreed previously on
>> this mailing list. I will email some thoughts about the results separately,
>> but here is the executive summary:
>>
>> src/servers/FtpServer.* # new FTP server, relaying FTP
>> src/servers/HttpServer.* # old ConnStateData parts conflicting with FtpServer
>> src/clients/FtpClient.* # code shared by old and new FTP clients
>> src/clients/FtpGateway.* # old FTP client, translating back to HTTP
>> src/clients/FtpRelay.* # new FTP client, relaying FTP
>> src/ftp/* # FTP stuff shared by clients and servers
>>
>>
>> * Fixed HttpHdr::Private/NoCache(v) implementations and optimized their API.
>>
>> * Tokenizer fixes and API improvements:
>>
>> Taught Tokenizer to keep track of the number of parsed bytes. Many callers
>> need to know that because they need to adjust/consume I/O offsets/buffers.
>>
>> Adjusted unused Parser::Tokenizer::token() to not treat NUL delimiter
>> specially. Besides the fact that not all grammars can treat NUL that way, the
>> special NUL treatment resulted in some token() calls returning true for empty
>> tokens, which was confusing parsers. Callers that do not require trailing
>> delimiters, should use prefix() instead. This change is based on experience
>> writing Tokenizer-based FTP parser, although the final parser code uses
>> prefix() instead of token(), for unrelated reasons.
>>
>> Split Parser::Tokenizer::skip(set) into skipOne() and skipAll(). All other
>> skip() methods skip exactly one thing (either a given character or a given
>> token) but the old skip(set) method was skipping multiple things. That
>> confused callers. We now force the caller to make a choice.
>>
>> Fixed Parser::Tokenizer::skip(char) to avoid out of bound access.
>>
>>
>> * CharacterSet enhancements:
>>
>> Added CharacterSet::complement() to create "all except those in that set" sets
>> handy for parsing (e.g., "get all characters until the end of line").
>>
>> Added CharacterSet::rename() to label sets. Handy in const declarations that
>> use expressions. For example: const CharacterSet AB = (A+B).renamed("AB").
>>
>>
>> * SBuf fixes:
>>
>> Do not allow SBuf::toLower/toUpper() return a value.
>>
>> Some callers think toLower() method changes the string and some think it
>> returns a changed value without changing the string. Without unportable hacks
>> like __attribute__((warn_unused_result)), some of the callers will guess wrong
>> and not know about it. It is easier/safer to change the object itself and
>> return nothing.
>>
>> Provided ToUpper/ToLower() functions (not methods!) that preserve their
>> paramter and return a new object for callers that need the old functionality.
>> These functions are a good candidate for __attribute__ hacks.
>
>
>
> Thank you,
>
> Alex.
>> [1] http://www.measurement-factory/tmp/attachments/ftp-native-r12814.patch.gz
>> [2] https://code.launchpad.net/~measurement-factory/squid/ftp-native
>
Audit results (part 1):
* Please apply CharacterSet updates separately.
* Please apply Tokenizer API updates separately.
* please apply src/HttpHdrCc.h changes separately.
All of the above can go in immediately IMO.
in src/FwdState.cc:
* at line 817 calling request->hier.note() twice in a row with different
values is both needless change and wasting CPU.
in src/HttpHeader.cc:
* please enact the new TODO about stripping FTP-Arguments "header" on
PASS command. This information leak could earn us a CVE if merged as-is.
in src/HttpHeaderTools.cc:
* httpHeaderQuoteString() could be optimized with needInnerQuote as
SBuf::size_type to append in blocks between characters needing quote.
in src/Server.cc:
* this change looks semantically wrong:
- if (!doneWithServer()) {
+ if (mayReadVirginReplyBody()) {
closeServer();
... because the action resulting is to close the server connection.
We may still need to write to this server, or to read some Trailers
after the reply body has finished.
in src/Server.h:
* I'm not sure the call order of doneWithServer() and
mayReadVirginReplyBody() is the right way around. It seems like "being
done completely with a server" includes "whether more reply is coming"
as one sub-condition of several (see above), not the reverse.
in src/anyp/PortCfg.cc:
* at line 199 please s/http(s)_port/*_port/ for the fatalf() message so
it no longer mentions http(s)_port when parsing ftp_port.
* please make the protocol=FTP case sensitive upper case "FTP".
- the variants with lower case http/https and without version numbers
are only accepted for backward compatibility. Which this new feature
does not suffer from.
* is FTP/1.1 really the protocol version of FTP parsed by Squid? (the
AnyP::ProtocolVersion parameters)
I suspect its actually closer to 2.2. If 1.x was RFC 354 and earlier
with FTP on port 3, and 2.0 moved FTP to port 21 from RFC 542, 2.1 in
RFC 765, and 2.2 in current RFC 959.
- NP: 0,0 version should be usable and probably best since there does
not appear to be any actual official version numbering for FTP.
in src/cache_cf.cc:
* squid.conf on/off parameters for directives generally do not
explicitly state =on/=off in their naming unless there are other values
to differentiate in the =value part.
- "ftp-track-dirs" is sufficient with a default of false.
in src/cf.data.pre:
* please document tproxy, name=, protocol=FTP as supported on ftp_port
* ... "client_idle_pconn_timeout s/used/uses/ for incoming"...
in src/clientStreamForward.h:
* why is enums.h included? please document.
* please remove the "\ingroup ClientStreamAPI" since you are shuffling
their related code
- we agreed to drop the doxygen "modules" thing which is what these
create.
- these can be replaced with a single TODO on creating a ClientStream
namespace instead.
in src/client_side.h:
* what is "waiting for future HttpsServer home" meant to mean on
postHttpsAccept()
* please document finishDechunkingRequest() properly now that it's
shifted scope.
* do not need to add protected: twice.
- or was the second one meant to be private: since it is now exposing
the entire set of private members and variables?
- if the privates are now supposed to be protected please TODO about
renaming them all from private_ style.
in src/clients/FtpClient.cc:
* needless empty line after squid.h include. same in FtpRelay.cc
* missing empty line between "" includes block and <> ones.
* please remove all "\ingroup ServerProtocolFTPInternal" this grouping
is replaced by the namespace already.
Ran out of time at src/clients/FtpRelay.cc. More to follow later.
Amos
Received on Fri Aug 08 2014 - 15:48:56 MDT
This archive was generated by hypermail 2.2.0 : Fri Aug 08 2014 - 12:00:12 MDT