 
Ticket metadata
Id: 4897
Status: resolved
Priority: 9/0
Queue: vdt-support

Fixed in: 1.10.1u
Fix scheduled: CUR

Owner: Tim Cartwright
Requestors: Alan.Sill@ttu.edu
greenc@fnal.gov
Cc:
AdminCc:

Created: Mon Feb 16 14:39:47 2009
Starts: Not set
Started: Not set
Last Contact: Mon Mar 02 14:41:58 2009
Due: Not set
Closed: Fri Mar 06 16:57:09 2009
Updated: Fri Mar 06 16:57:09 2009 by cat



History
CC: Alan Sill <Alan.Sill@ttu.edu>, Local Grid Accounting List <grid-accounting@fnal.gov>
Subject: [Fwd: LSF probe update failed]
Date: Mon, 16 Feb 2009 14:39:07 -0600
To: vdt-support@OPENSCIENCEGRID.ORG
From: Chris Green <greenc@fnal.gov>
Hi,

Alan Sill has reported the below-mentioned problem upgrading his LSF probe. Please open a ticket with Alan as the requester and cc: grid-accounting@fnal.gov.

Thanks,
Chris.

-------- Original Message --------
Subject: LSF probe update failed
Date: Mon, 16 Feb 2009 14:16:30 -0600
From: Alan Sill <alan.sill@ttu.edu>
To: Chris Green <greenc@fnal.gov>
CC: Alan Sill <alan.sill@ttu.edu>
References: <725DCE58-23D8-4B6E-AF21-AB8A6A7567BD@ttu.edu> <4999C80C.8010707@fnal.gov>

On Feb 16, 2009, at 2:09 PM, Chris Green wrote:

> If you run managedfork, you get the Condor probe and it *is* appropriate
> to use it.

OK, that worked - the Condor probe updated without errors.  But for LSF:

[root@antaeus grid]# pacman -update Gratia-LSF-Probe
Update of [/usr/local/OSG_1_0_0:http://vdt.cs.wisc.edu/vdt_1101_cache:Gratia-LSF-Probe] found...
WARNING: Uninstall shell command [vdt/sbin/vdt-uninstall Gratia-LSF-Probe] has failed [vdt-uninstall failed: directory '/usr/local/OSG_1_0_0/gratia/probe/lsf/libexec' is not empty but has a backup
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/LICENSE'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/ProbeConfig'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/README'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/pbs-lsf.py'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/pbs-lsf_meter.cron.sh'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/pbs-lsf_meter.pl'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/test/lsf-logdir/lsb.acct'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/test/lsf-logdir/lsb.events'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/test/lsf-logdir/lsb.events.1'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/test/lsf-logdir/lsb.events.index'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/test/pbs-logdir/20060601'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/test/pbs-logdir/20060605'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/test/pbs-logdir/20060606'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/test/pbs-logdir/20060607'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/test/pbs-logdir/20060608'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/urCollector.conf'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/urCollector.pl'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/urCollector/Common.pm'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/urCollector/Configuration.pm'
(main) unlink '/usr/local/OSG_1_0_0/gratia/probe/lsf/urCreator'
rmdir '/usr/local/OSG_1_0_0/gratia/var/tmp/urCollector'
rmdir '/usr/local/OSG_1_0_0/gratia/var/lock'
rmdir '/usr/local/OSG_1_0_0/gratia/probe/lsf/urCollector'
rmdir '/usr/local/OSG_1_0_0/gratia/probe/lsf/test/pbs-logdir'
rmdir '/usr/local/OSG_1_0_0/gratia/probe/lsf/test/lsf-logdir'
rmdir '/usr/local/OSG_1_0_0/gratia/probe/lsf/test'].  Ignoring...
Can't uninstall [Gratia-LSF-Probe].  Not updated...


-- 
Chris Green <greenc@fnal.gov>, FNAL CD/SCF/GRID; 'phone (630) 840-2167.
IRC: greenc@jabber.fnal.gov, ChrisGreen@jabber.dsd.lbl.gov;
chissgreen (AIM, Yahoo); chissg@hotmail.com (MSNM);
chris.h.green (Google Talk).
CC: Alan Sill <alan.sill@ttu.edu>, "vdt-support@opensciencegrid.org" <vdt-support@OPENSCIENCEGRID.ORG>, Local Grid Accounting List <grid-accounting@fnal.gov>
Subject: Re: [Fwd: LSF probe update failed]
Date: Mon, 16 Feb 2009 15:26:19 -0600
To: Chris Green <greenc@fnal.gov>
From: Alan Sill <alan.sill@ttu.edu>
I was able to make progress after the failure reported below by simply
installing, rather than updating, the Gratia-LSF-Probe component.

Thanks,
Alan

Alan Sill, Ph.D
Senior Scientist, High Performance Computing Center
Adjunct Professor of Physics
TTU

====================================================================
: Alan Sill, Texas Tech University Office: Admin 233, MS 4-1167 :
: e-mail: Alan.Sill@ttu.edu ph. 806-742-4350 fax 806-742-4358 :
====================================================================
Download (untitled) / with headers
text/plain 3.7k
This failure makes me worried. I'm glad that Alan Sill is working now,
but if you have the chance, could you please run the following commands?

cd $VDT_LOCATION
. setup.sh (or source setup.csh, as appropriate)
vdt-system-profiler

Then mail me the vdt-profile.txt that is created. This will include the
vdt-install.log that will help us understand the problem.

Thanks,
-alain

Download (untitled) / with headers
text/plain 2.5k
I think I understand where this problem comes from. It's very tricky. In a
nutshell, it seems to be caused by multiple updates to a package containing a
symlink to '.'. The Gratia-LSF-Probe package happens to have 2 such symlinks.

To fully understand what happens, one must understand some of the behaviors of
vdt-untar and vdt-uninstall. First, vdt-untar: When installing a symlink, it
checks to see if a filesystem entry already exists in the target location. If
so, it moves the original entry to a backup, then copies the new symlink into
place. It does so even when the target symlink points to the same location as
the source symlink; the reasons for this are documented somewhere...

Now when vdt-uninstall goes to delete a symlink to a directory (which includes a
symlink to '.'), there are two important considerations. One is whether the
target directory is empty -- if it is not empty, then vdt-uninstall refuses to
remove the symlink, because some other package might still need it to refer to
the contained files. The other consideration is whether the package being
deleted has a backup of the target directory, which means that the package is
responsible for restoring the backup. If there's a symlink to a directory that
(a) is not empty and thus cannot be deleted and (b) has a backup, the script
fails because it cannot decide which contents to keep. This is the failure that
Alan Sill ran into.
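
The decision described above can be sketched in Python. This is only a rough model under stated assumptions, not the actual vdt-uninstall code (which does not appear in this ticket): the names classify_symlink_removal and make_dot_symlink, and the has_backup flag, are hypothetical, and a POSIX filesystem is assumed.

```python
import os
import tempfile


def classify_symlink_removal(link_path, has_backup):
    """Model the uninstaller's choice for a symlink that points to a directory.

    Returns "remove", "keep", or "fail". Hypothetical helper for illustration.
    """
    target = os.path.realpath(link_path)
    target_is_empty = len(os.listdir(target)) == 0
    if target_is_empty:
        # Nothing else depends on the target, so the link can go.
        return "remove"
    if has_backup:
        # Non-empty target AND a backup to restore: no safe choice.
        return "fail"
    # Non-empty target, no backup: leave the link for other packages.
    return "keep"


def make_dot_symlink(name="libexec"):
    """Create a scratch directory holding a symlink to '.'; return the link path."""
    root = tempfile.mkdtemp()
    link = os.path.join(root, name)
    os.symlink(".", link)
    return link


link = make_dot_symlink()
print(classify_symlink_removal(link, has_backup=False))  # keep
print(classify_symlink_removal(link, has_backup=True))   # fail
```

Because a symlink to '.' lives inside its own target, the target always contains at least the link itself, so the "remove" branch can never be reached for it.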

Why did this happen? When Gratia-LSF-Probe is first installed, it lays down the
symlinks to '.' with no problem and logs them in its filelist. When the package
is updated, however, the problem starts. On the first update, vdt-uninstall
finds a symlink to a directory ('.') which is not empty (because, among other
things, the symlink itself is contained within '.'). Thus, the symlink is left
in place. In the second phase of the update, when the package is being
installed again, vdt-untar finds the original symlink in place, backs it up, and
replaces it with an identical copy. This is expected. However, at this point,
we have the conditions for the failure: The symlink cannot be removed (because
its containing directory is never empty) AND it has a backup.
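
The install, update, update sequence can be traced with a small state model. This is a sketch of the behavior as described in this ticket, not of the real vdt-untar or vdt-uninstall code; the class DotSymlink and the functions install, uninstall, and update are hypothetical names.

```python
class DotSymlink:
    """Tracks one symlink to '.' as recorded by a package (illustrative only)."""

    def __init__(self):
        self.present = False
        self.has_backup = False


def install(link):
    # Modeled on the vdt-untar behavior described above: an existing entry
    # is moved to a backup before the new symlink is put in place.
    if link.present:
        link.has_backup = True
    link.present = True


def uninstall(link):
    # A symlink to '.' sits inside its own target, so the target is never
    # empty and the link is never removed.
    if link.present and link.has_backup:
        raise RuntimeError("directory is not empty but has a backup")
    # Otherwise the link is simply left in place.


def update(link):
    uninstall(link)
    install(link)


link = DotSymlink()
install(link)       # initial install: link created, no backup
update(link)        # first update: link kept, then backed up on reinstall
try:
    update(link)    # second update: uninstall hits the failure condition
    outcome = "updated"
except RuntimeError as err:
    outcome = "failed: " + str(err)
print(outcome)
```

In this model the first update succeeds but leaves has_backup set, which is why the error only surfaces on the second update.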

Because the problem actually occurs in vdt-uninstall, and in part is caused by
the file management subsystem's inability to handle filesystem/symlink loops, we
can "treat" the problem there for now. Specifically, we've decided to add
special code to the uninstaller to allow the removal of a symlink to '.'.
Handling all cases of looping is being left for a later date, if needed.
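
A special case of this kind might look like the following Python sketch. It is an assumption-laden illustration, not the committed change: is_symlink_to_dot and uninstall_symlink are hypothetical names, restoring a backup is not modeled, and a POSIX filesystem is assumed.

```python
import os
import tempfile


def is_symlink_to_dot(path):
    """True if path is a symlink whose stored target is '.' (for example '.' or './')."""
    return os.path.islink(path) and os.path.normpath(os.readlink(path)) == "."


def uninstall_symlink(path, has_backup):
    """Remove a directory symlink, special-casing links to '.'. Illustrative only."""
    if is_symlink_to_dot(path):
        # The target can never be empty, so skip the emptiness and backup checks.
        os.unlink(path)
        return "removed"
    target_nonempty = len(os.listdir(os.path.realpath(path))) > 0
    if target_nonempty and has_backup:
        raise RuntimeError("directory is not empty but has a backup")
    if target_nonempty:
        return "kept"
    os.unlink(path)
    return "removed"


root = tempfile.mkdtemp()
link = os.path.join(root, "libexec")
os.symlink(".", link)
result = uninstall_symlink(link, has_backup=True)
print(result, os.path.lexists(link))  # removed False
```

The check reads the link's stored target rather than resolving it, so only a literal link to '.' is special-cased; longer symlink loops still go through the ordinary checks.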
Download (untitled) / with headers
text/plain 1008b
Alan & Chris:

> OK, that worked - the Condor probe updated without errors. But for LSF:

> [root@antaeus grid]# pacman -update Gratia-LSF-Probe
> Update of [/usr/local/OSG_1_0_0:http://vdt.cs.wisc.edu/vdt_1101_cache:Gratia-LSF-Probe] found...
> WARNING: Uninstall shell command [vdt/sbin/vdt-uninstall Gratia-LSF-Probe] has failed
> [vdt-uninstall failed:
> directory '/usr/local/OSG_1_0_0/gratia/probe/lsf/libexec' is not empty but has a backup

I think I understand what happened here. It turns out to be rather complicated.
In a nutshell, the Gratia-LSF-Probe installed a couple of symlinks to '.', and
they caused the VDT uninstaller to fail the second time the package was updated.

As discussed out of the ticket, Chris is going to repackage the Gratia LSF probe
to not have the dubious symlinks, and I'm going to fix the VDT to not fail even
if they are present. That should cover it. If all goes well, the VDT fix will
be released tomorrow, and the Gratia updates will be picked up soon.

-- Tim
CC: Alan.Sill@ttu.edu
Subject: Re: [vdt-support #4897] LSF probe update failed
Date: Mon, 02 Mar 2009 14:43:43 -0600
To: vdt-support@OPENSCIENCEGRID.ORG
From: Chris Green <greenc@fnal.gov>
Tim Cartwright via RT wrote:
> I think I understand what happened here. It turns out to be rather complicated.
> In a nutshell, the Gratia-LSF-Probe installed a couple of symlinks to '.', and
> they caused the VDT uninstaller to fail the second time the package was updated.
>
> As discussed out of the ticket, Chris is going to repackage the Gratia LSF probe
> to not have the dubious symlinks, and I'm going to fix the VDT to not fail even
> if they are present. That should cover it. If all goes well, the VDT fix will
> be released tomorrow, and the Gratia updates will be picked up soon.
>
Note that this is almost certainly a problem on the PBS probe too since
they come from the same RPM.

Thanks,
Chris.
> -- Tim
>
>


--
Chris Green <greenc@fnal.gov>, FNAL CD/SCF/GRID; 'phone (630) 840-2167.
IRC: greenc@jabber.fnal.gov, ChrisGreen@jabber.dsd.lbl.gov;
chissgreen (AIM, Yahoo); chissg@hotmail.com (MSNM);
chris.h.green (Google Talk).
Subject: [vdt-support #4897] SVN commit, rev 8792
To: vdt-support@cs.wisc.edu
From: cat@cs.wisc.edu
Commit comment:
Fixed vdt-uninstall to always remove a symlink to '.'. The motivation for this
change is documented in the RT ticket.


Changed files:
U vdt/branches/vdt-1.10.1/VDT-Core/vdt/sbin/vdt-uninstall

To generate a diff:
svn diff -c 8792 file:///p/condor/workspaces/vdt/svn