| # | Mon Oct 01 04:07:36 2007 | noreply-savannah@cern.ch - Ticket created |
This is an automated notification sent by LCG Savannah.
It relates to: bugs #29930, project gLite Middleware

==============================================================================
LATEST MODIFICATIONS of bugs #29930:
==============================================================================

Follow-up Comment #2, bug #29930 (project jra1mdw):

Dear VDT Support, please look into this bug considered critical for WLCG/EGEE.

==============================================================================
OVERVIEW of bugs #29930:
==============================================================================

URL: <http://savannah.cern.ch/bugs/?29930>

Summary: gridftp2 server can send truncated control channel messages
Project: gLite Middleware
Submitted by: dhsmith
Submitted on: 2007-09-27 15:18
Status: None
Open/Closed: Open
Category: Globus
Severity: 6 - Critical
Baseline Release: Unknown
OS: None
Architecture: None
Bug detection area: Production
Assigned to: maart
Priority: 5 - Enhancement
GGUS reference URL: https://gus.fzk.de/ws/ticket_info.php?ticket=26868
Component tag(s):
Subsystem tag(s):
Discussion Lock: Any
Build environment: None
Release:

_______________________________________________________

Hello,

We recently had a problem reported by a site using a 64-bit SL4 release of DPM (and hence the gridftp2 server) that was traced to an internal globus/gridftp2 library issue:

from VDT RPM vdt_globus_data_server-VDT1.6.0x86_64_rhas_4-3
in library(s) libglobus_gridftp_server_control_*
function globus_l_xio_gssapi_ftp_write(), from source file globus_xio_gssapi_ftp.c at approximately line 2590:

    res = globus_xio_driver_pass_write(
        op,
        &handle->auth_write_iov,
        1,
        length,
        cb,
        handle);

Here the argument 'length' refers to the length of the unwrapped (unencrypted) and binary (not base64) version of the message, whereas handle->auth_write_iov.iov_len has the actual length; thus 'length' is about 75% of the actual length of the message that the function wants to send.

The fourth argument to globus_xio_driver_pass_write() is the 'wait_for' parameter, which sets the minimum amount of data that must have been sent to the kernel before the callback is called. Thus if the kernel accepts more than about 75% of the length of the message, but less than all of it, the message is truncated.

Browsing the globus CVS I noticed that recent versions/branches of globus_xio_gssapi_ftp.c have been reworked and probably do not suffer from this problem - although I don't know if exactly this problem was ever recognised explicitly. I don't know if there are any globus/gridftp2 releases based on the new code, but it is likely we will run into this bug again, so it is very desirable to have it fixed - either as a fix to the older version or by moving to the newer code.

Thank you,
David

_______________________________________________________

Follow-up Comments:

-------------------------------------------------------
Date: 2007-10-01 09:02
By: Maarten Litmaath <maart>

Dear VDT Support, please look into this bug considered critical for WLCG/EGEE.

-------------------------------------------------------
Date: 2007-09-28 08:53
By: David Smith <dhsmith>

Hello,

Moved to critical - as I expect this problem will lead to service instability anywhere we deploy gridftp2, which is an increasing number of the DPM sites.

David

_______________________________________________________

Carbon-Copy List:

CC Address                          | Comment
------------------------------------+-----------------------------
488                                 | -COM-
vdt-support@opensciencegrid.org     |
868                                 | -UPD-
sburke                              |
645                                 | -SUB-

==============================================================================

This item URL is: <http://savannah.cern.ch/bugs/?29930>

_______________________________________________
Message sent via/by LCG Savannah
http://savannah.cern.ch/
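The "about 75%" figure in the report is consistent with plain base64 arithmetic: every 3 raw bytes become 4 encoded characters. The following is a minimal sketch of that arithmetic only. The helper names are hypothetical and not part of Globus, and the sketch ignores any GSSAPI wrapping overhead and reply framing that the real control channel message also carries.

```c
#include <assert.h>
#include <stddef.h>

/* Length of the padded base64 encoding of raw_len bytes:
 * each group of up to 3 input bytes becomes 4 output characters.
 * (Hypothetical helper, for illustration only.) */
size_t base64_encoded_len(size_t raw_len)
{
    return 4 * ((raw_len + 2) / 3);
}

/* Raw length as an integer percentage of the encoded length,
 * rounded down. raw_len must be non-zero. */
unsigned raw_to_encoded_percent(size_t raw_len)
{
    size_t enc = base64_encoded_len(raw_len);
    assert(enc > 0);
    return (unsigned)(100 * raw_len / enc);
}
```

For a 300-byte raw message the encoded form is 400 characters, so a threshold equal to the raw length sits at 75% of what actually has to go out on the wire. Very short messages deviate from that ratio because of padding.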
| # | Mon Oct 01 12:08:27 2007 | roy - Requestor noreply-savannah@cern.ch deleted |
| # | Mon Oct 01 12:08:28 2007 | roy - Cc john.white@cern.ch added |
| # | Mon Oct 01 12:08:28 2007 | roy - Cc David.Smith@cern.ch added |
| # | Mon Oct 01 12:08:29 2007 | roy - Requestor litmaath@cern.ch added |
| # | Mon Oct 01 12:08:40 2007 | roy - Cc s.burke@rl.ac.uk added |
| # | Mon Oct 01 13:30:58 2007 | roy - Correspondence added |
I'll work with Globus immediately to resolve this bug. Two questions for you:

> function globus_l_xio_gssapi_ftp_write(), from source file:
> globus_xio_gssapi_ftp.c at line approx. 2590:
>
>     res = globus_xio_driver_pass_write(
>         op,
>         &handle->auth_write_iov,
>         1,
>         length,
>         cb,
>         handle);
>
> here the argument 'length' refers to the length of the unwrapped
> (unencrypted) and binary (not base64) version of the message (whereas
> handle->auth_write_iov.iov_len has the actual length); and thus
> 'length' is about 75% of the actual length of the message that the
> function wants to send. The fourth argument to
> globus_xio_driver_pass_write() is the 'wait_for' parameter and sets
> the minimum amount of data to have sent to the kernel before the
> callback is called. Thus if the kernel accepts more than about 75% of
> the length of the message, but less than all of it, the message is
> truncated.

Are you suggesting that we could simply pass handle->auth_write_iov.iov_len instead of length? Clearly it would require testing, but is it possible that the fix is as simple as that?

I understand that this is a critical bug, but I don't have a good feeling for how it is showing up in production. Does it result in some file transfers failing? What is the symptom of the failure? How often does it occur?

Thanks!
-alain
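To make the question above concrete, here is a small model of the 'wait_for' behaviour as the report describes it: writing continues only until at least 'wait_for' bytes have been accepted, and then the operation completes. The function `write_until` and its `chunk` parameter (the most the transport accepts per attempt) are invented for this sketch; it is not the Globus XIO implementation and makes no claim about how the real driver schedules writes.

```c
#include <assert.h>
#include <stddef.h>

/* Model of a write that completes once at least wait_for bytes of a
 * total-byte message have been accepted. The transport takes at most
 * chunk bytes per attempt. Returns the number of bytes actually sent.
 * (Hypothetical model, not Globus code.) */
size_t write_until(size_t total, size_t wait_for, size_t chunk)
{
    size_t sent = 0;

    assert(chunk > 0 && wait_for <= total);
    while (sent < wait_for) {
        size_t want = total - sent;          /* bytes still unsent */
        sent += want < chunk ? want : chunk; /* partial acceptance */
    }
    return sent; /* may be less than total when wait_for < total */
}
```

In this model, a 400-byte message with wait_for = 300 and a transport that accepts 320 bytes per attempt stops after 320 bytes, which is the truncation described in the report. With wait_for = 400 (the analogue of passing iov_len) the same transport delivers all 400 bytes.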
| # | Mon Oct 01 13:30:59 2007 | RT_System - Status changed from 'new' to 'open' |
| # | Mon Oct 01 13:30:59 2007 | roy - Given to roy |
| # | Mon Oct 01 13:51:14 2007 | roy - Correspondence added |
This is now Globus bug #5590:
http://bugzilla.globus.org/bugzilla/show_bug.cgi?id=5590

I'll keep a close eye on it.
| # | Tue Oct 02 02:06:04 2007 | David.Smith@cern.ch - Correspondence added |
On Oct 1, 2007, at 8:30 PM, Alain Roy via RT wrote:

> I'll work with Globus immediately to resolve this bug. Two questions for
> you:
[...]
> Are you suggesting that we could simply pass
> handle->auth_write_iov.iov_len instead of length? Clearly it would
> require testing, but is it possible that the fix is as simple as that?

Hello Alain,

Thanks for taking this up. I believe passing handle->auth_write_iov.iov_len instead of length would be fine; in the case we had in production we made a test library which effectively did that, and it resolved the problem. However, it was a small binary patch (as I wasn't confident of rebuilding the library from source with absolutely no other change just for the test), so strictly speaking we did not try out exactly that source change.

> I understand that this is a critical bug, but I don't have a good
> feeling for how it is showing up in productions. Does it result in some
> file transfers failing? What is the symptom of the failure? How often
> does it occur?

I was a little undecided as to where to place the criticality of the bug - somewhere between major and critical. So far it has only been seen at one production site, and it is difficult to reproduce on a given gridftp2 instance. But we have a relatively small number of DPMs using the new gridftp2, so the concern is that it would become more frequent as the number of instances grows.

It was a failure in FTS transfers from one site to another. The problem showed up only in transfers between those two particular sites, and the exact conditions required appear to be complex: it appears to depend on the size of the range markers (and thus the size of the messages the server was writing on the control channel), and so on the transfer rate, number of streams and TCP buffer sizes, as well as the configuration of the system where the globus gridftp2 is running.

The symptom was the error:

debug: error reading response from gsiftp://node26.datagrid.cea.fr/node26.datagrid.cea.fr:/pool_node26/dteam/2007-09-25/david102d.2397.0: globus_l_ftp_control_read_cb: Error while searching for end of reply
debug: fault on connection to gsiftp://node26.datagrid.cea.fr/node26.datagrid.cea.fr:/pool_node26/dteam/2007-09-25/david102d.2397.0: globus_l_ftp_control_read_cb: Error while searching for end of reply
debug: error reading response from gsiftp://ccxfer13.in2p3.fr:2811//pnfs/in2p3.fr/data/dteam/disk/dapnia/cleroy/ATLAS-filenode20_007: an I/O operation was cancelled
debug: operation complete
error: globus_l_ftp_control_read_cb: Error while searching for end of reply

The above is the client-side error I produced while investigating. It was reported by the ftp client that was handling the third-party copy, an older globus 2 based client in this case. In production the result is an FTS error, with a similar error to the above being available from the FTS server when the transfer job status is queried.

Yours,
David

--
-------------------------------------------------------------------------
David Smith
e-mail: David.Smith@cern.ch
tel: +41 22 76 70677
Address: D. Smith, CERN G06210, Bat 28 1-015, 1211 Geneva 23, Switzerland
-------------------------------------------------------------------------
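One plausible reading of "Error while searching for end of reply" is that the client never sees the line terminator of a control channel reply, because the tail of the message was cut off. The sketch below shows that kind of check in isolation. The function name is hypothetical, this is not the Globus client's parsing code, and the "631" reply code in the usage note is only an illustrative protected-reply prefix; the thread does not say which code the server used.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the first len bytes of buf contain a CRLF pair,
 * i.e. at least one complete reply line, and 0 otherwise.
 * (Hypothetical helper, for illustration only.) */
int has_end_of_reply(const char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i + 1 < len; i++) {
        if (buf[i] == '\r' && buf[i + 1] == '\n')
            return 1;
    }
    return 0;
}
```

A complete line such as "631 QUJD\r\n" passes the check, while the same line cut short before its terminator does not, so a reader waiting for the end of the reply would eventually give up or time out.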
| # | Mon Oct 08 17:17:20 2007 | roy - Correspondence added |
We got a comment from Michael Link (from Globus) today.

http://bugzilla.globus.org/bugzilla/show_bug.cgi?id=5590

> That change looks good to me. I'll commit a fix for that and release an update
> package soon.
>
> The changes you noted in the current CVS trunk have the side effect of fixing
> the problem as well, but they were designed for better threaded performance
> along with other changes throughout the package, so they aren't really suitable
> for porting to the 4.0.x branch.
>
> Thanks for reporting this.

I'll see what his update contains, then we can produce an updated Globus for VDT 1.6.1 and get it to you ASAP.

-alain
| # | Mon Oct 08 17:17:38 2007 | roy - Priority changed from (no value) to '3' |
| # | Mon Oct 08 17:17:38 2007 | roy - Fix scheduled OLD added |
| # | Tue Oct 09 04:09:43 2007 | David.Smith@cern.ch - Correspondence added |
On Oct 9, 2007, at 12:17 AM, Alain Roy via RT wrote:

> We got a comment from Michael Link (from Globus) today.
>
> http://bugzilla.globus.org/bugzilla/show_bug.cgi?id=5590
>
>> That change looks good to me. I'll commit a fix for that and release
>> an update package soon.
>>
>> The changes you noted in the current CVS trunk have the side effect of
>> fixing the problem as well, but they were designed for better threaded
>> performance along with other changes throughout the package, so they
>> aren't really suitable for porting to the 4.0.x branch.
>>
>> Thanks for reporting this.
>
> I'll see what his update contains, then we can produce an updated Globus
> for VDT 1.6.1 and get it to you ASAP.

Hello Alain,

That's great news. Thanks for following that up with globus (as well as the subsequent work that is needed to make a VDT update).

By the way, I opened another bug, savannah bug https://savannah.cern.ch/bugs/?30106, which is also a globus issue. I only described that one as 'normal' severity, but I wanted to check that you'd seen that I copied vdt-support on it. It is probably another small, localized fix - but again it will need to be checked with globus to see if they agree or have other comments.

Yours,
David

--
-------------------------------------------------------------------------
David Smith
e-mail: David.Smith@cern.ch
tel: +41 22 76 70677
Address: D. Smith, CERN G06210, Bat 28 1-015, 1211 Geneva 23, Switzerland
-------------------------------------------------------------------------
| # | Thu Dec 20 10:12:45 2007 | roy - Comments added |
This was released to EGEE as a 1.6.1 update, but not in the Pacman 1.6.1. It's in 1.8.1 (where it came as an update) and forward.
| # | Thu Dec 20 10:12:45 2007 | roy - Status changed from 'open' to 'resolved' |
| # | Fri Jan 18 06:00:37 2008 | noreply-savannah@cern.ch - Ticket 3236: Ticket created |
This is an automated notification sent by LCG Savannah.
It relates to: bugs #29930, project gLite Middleware

Update of bug #29930 (project jra1mdw):

Status: Ready for Test => Ready for Review

This item URL is: <http://savannah.cern.ch/bugs/?29930>
| # | Fri Jan 18 06:00:37 2008 | noreply-savannah@cern.ch - Ticket 3235: Ticket created |
This is an automated notification sent by LCG Savannah.
It relates to: bugs #29930, project gLite Middleware

Update of bug #29930 (project jra1mdw):

Status: Accepted => In progress

This item URL is: <http://savannah.cern.ch/bugs/?29930>
| # | Fri Jan 18 06:00:37 2008 | noreply-savannah@cern.ch - Ticket 3237: Ticket created |
This is an automated notification sent by LCG Savannah.
It relates to: bugs #29930, project gLite Middleware

Update of bug #29930 (project jra1mdw):

Status: In progress => Integration Candidate

This item URL is: <http://savannah.cern.ch/bugs/?29930>
| # | Fri Jan 18 06:00:55 2008 | noreply-savannah@cern.ch - Ticket 3238: Ticket created |
This is an automated notification sent by LCG Savannah.
It relates to: bugs #29930, project gLite Middleware

Update of bug #29930 (project jra1mdw):

Status: None => Accepted

This item URL is: <http://savannah.cern.ch/bugs/?29930>
| # | Fri Jan 18 06:01:01 2008 | noreply-savannah@cern.ch - Ticket 3239: Ticket created |
This is an automated notification sent by LCG Savannah.
It relates to: bugs #29930, project gLite Middleware

Update of bug #29930 (project jra1mdw):

Status: Integration Candidate => Ready for Test

This item URL is: <http://savannah.cern.ch/bugs/?29930>
| # | Wed Jan 23 16:03:22 2008 | cat - Ticket 3236: Merged into | ||
| # | Wed Jan 23 16:03:36 2008 | cat - Ticket 3237: Merged into | ||
| # | Wed Jan 23 16:03:52 2008 | cat - Ticket 3238: Merged into | ||
| # | Wed Jan 23 16:04:33 2008 | cat - Ticket 3239: Merged into | ||
| # | Wed Feb 13 03:05:25 2008 | noreply-savannah@cern.ch - Ticket 3313: Ticket created | ||
This is an automated notification sent by LCG Savannah. It relates to: bugs #29930, project gLite Middleware ============================================================================== LATEST MODIFICATIONS of bugs #29930:============================================================================== Follow-up Comment #3, bug #29930 (project jra1mdw): Fixed in VDT 1.6.1i: http://vdt.cs.wisc.edu/releases/1.6.1/release.html ============================================================================== OVERVIEW of bugs #29930:============================================================================== URL: <http://savannah.cern.ch/bugs/?29930> Summary: gridftp2 server can send truncated control channel messages Project: gLite Middleware Submitted by: dhsmith Submitted on: 2007-09-27 17:18 Status: Ready for Review Open/Closed: Open Category: Globus Severity: 6 - Critical Baseline Release: Unknown OS: None Architecture: None Bug detection area: Production Assigned to: egeetest Priority: 5 - Enhancement GGUS reference URL: https://gus.fzk.de/ws/ticket_info.php?ticket=26868 Component tag(s): Subsystem tag(s): Discussion Lock: Any Build environment: None Release: _______________________________________________________ Follow-up Comments: ------------------------------------------------------- Date: 2008-02-13 10:04 By: FROHNER Akos <szamsu> Fixed in VDT 1.6.1i: http://vdt.cs.wisc.edu/releases/1.6.1/release.html ------------------------------------------------------- Date: 2007-10-01 11:02 By: Maarten Litmaath <maart> Dear VDT Support, please look into this bug considered critical for WLCG/EGEE. ------------------------------------------------------- Date: 2007-09-28 10:53 By: David Smith <dhsmith> Hello, Moved to critical - as I expect this problem will lead to service instability anywhere where we deploy gridftp2, which is an increasing number of the DPM sites. David _______________________________________________________ Carbon-Copy List: CC Address | Comment ------------------------------------+----------------------------- 667 | -COM- 522 | -UPD- 488 | -COM- vdt-support@opensciencegrid.org | 868 | -UPD- sburke | 645 | -SUB- ============================================================================== This item URL is: <http://savannah.cern.ch/bugs/?29930> _______________________________________________ Message sent via/by LCG Savannah http://savannah.cern.ch/ |
||||||||||||
| # | Wed Feb 20 03:23:16 2008 | noreply-savannah@cern.ch - Ticket 3328: Ticket created | ||
This is an automated notification sent by LCG Savannah. It relates to: bugs #29930, project gLite Middleware ============================================================================== LATEST MODIFICATIONS of bugs #29930:============================================================================== Update of bug #29930 (project jra1mdw): Status: Ready for Review => Fixed _______________________________________________________ Follow-up Comment #4: Hello, We've verified that the relevant globus CVS change is reflected in vdt_globus_data_server-VDT1.6.1x86_64_rhas_4-6 and that manually (i.e. with a debugger) provoking the original condition associated with the problem now gives good behavior. The site which was experiencing the fault is no longer seeing any problem - even with the old software. Nevertheless, I believe this bug can be considered fixed now. Thanks to those involved in getting this change made. David ============================================================================== OVERVIEW of bugs #29930:============================================================================== URL: <http://savannah.cern.ch/bugs/?29930> Summary: gridftp2 server can send truncated control channel messages Project: gLite Middleware Submitted by: dhsmith Submitted on: 2007-09-27 17:18 Status: Fixed Open/Closed: Closed Category: Globus Severity: 6 - Critical Baseline Release: Unknown OS: None Architecture: None Bug detection area: Production Assigned to: egeetest Priority: 5 - Enhancement GGUS reference URL: https://gus.fzk.de/ws/ticket_info.php?ticket=26868 Component tag(s): Subsystem tag(s): Discussion Lock: Any Build environment: None Release: _______________________________________________________ Follow-up Comments: ------------------------------------------------------- Date: 2008-02-20 10:10 By: David Smith <dhsmith> Hello, We've verified that the relevant globus CVS change is reflected in vdt_globus_data_server-VDT1.6.1x86_64_rhas_4-6 and that manually (i.e. with a debugger) provoking the original condition associated with the problem now gives good behavior. The site which was experiencing the fault is no longer seeing any problem - even with the old software. Nevertheless, I believe this bug can be considered fixed now. Thanks to those involved in getting this change made. David ------------------------------------------------------- Date: 2008-02-13 10:04 By: FROHNER Akos <szamsu> Fixed in VDT 1.6.1i: http://vdt.cs.wisc.edu/releases/1.6.1/release.html ------------------------------------------------------- Date: 2007-10-01 11:02 By: Maarten Litmaath <maart> Dear VDT Support, please look into this bug considered critical for WLCG/EGEE. ------------------------------------------------------- Date: 2007-09-28 10:53 By: David Smith <dhsmith> Hello, Moved to critical - as I expect this problem will lead to service instability anywhere where we deploy gridftp2, which is an increasing number of the DPM sites. David _______________________________________________________ Carbon-Copy List: CC Address | Comment ------------------------------------+----------------------------- 667 | -COM- 522 | -UPD- 488 | -COM- vdt-support@opensciencegrid.org | 868 | -UPD- sburke | 645 | -SUB- ============================================================================== This item URL is: <http://savannah.cern.ch/bugs/?29930> _______________________________________________ Message sent via/by LCG Savannah http://savannah.cern.ch/ |
||||||||||||
| # | Fri Feb 22 17:08:00 2008 | roy - Ticket 3235: Merged into | ||
| # | Fri Feb 22 17:08:05 2008 | roy - Ticket 3313: Merged into | ||
| # | Fri Feb 22 17:08:15 2008 | roy - Ticket 3328: Merged into | ||
| # | Fri Feb 22 17:09:06 2008 | roy - Requestor noreply-savannah@cern.ch deleted | ||
| # | Fri Feb 22 17:09:44 2008 | roy - Correspondence added | ||
|
Great! We'll move this into new versions of the VDT as well then. Thanks, -alain > We've verified the relevant globus CVS change is reflected in > > vdt_globus_data_server-VDT1.6.1x86_64_rhas_4-6 > > and that manually (i.e. with a debugger) provoking the original condition associated with the problem now gives good behavior. The site which was experiencing the fault is no longer seeing any problem - even with the old software. Nevertheless, I believe this bug can be considered fixed now. > > Thanks to those involved in getting this change made. > > David |
||||
| # | Fri Feb 22 17:09:45 2008 | RT_System - Status changed from 'resolved' to 'open' | ||
| # | Wed Mar 05 21:10:55 2008 | roy - Status changed from 'open' to 'resolved' | ||