Ticket metadata
Id: 3633
Status: resolved
Priority: 3/0
Queue: vdt-support

Fixed in: (no value)
Fix scheduled: CUR

Owner: Alan De Smet
Requestors: Marco Mambelli
Cc:
AdminCc:

Created: Thu Jun 26 18:20:42 2008
Starts: Not set
Started: Not set
Last Contact: Thu Aug 28 17:34:07 2008
Due: Not set
Closed: Tue Sep 02 13:49:43 2008
Updated: Tue Sep 02 13:49:43 2008 by adesmet



History
Subject: slimming some VDT packages
Date: Thu, 26 Jun 2008 18:17:13 -0500 (CDT)
To: vdt-support@OPENSCIENCEGRID.ORG
From: Marco Mambelli <marco@hep.uchicago.edu>
Hi,
some packages have multiple copies of the exact same library (mylib.so,
mylib.so.1.2.3, ...). These could be symbolic links.

The last one I saw is the dcache-client but I saw others before

It's not a bug or a big problem but could help to reduce the size of
future releases.

Cheers,
Marco

Hi Marco--

> some packages have multiple copies of the exact same library (mylib.so,
> mylib.so.1.2.3, ...). These could be symbolic links.
>
> The last one I saw is the dcache-client but I saw others before
>
> It's not a bug or a big problem but could help to reduce the size of
> future releases.

Can you give me a specific example? Do an 'ls -l' on an example?

Thanks,
-alain

-----------------------------------------------------------------
Alain Roy vdt-support@opensciencegrid.org
VDT Support http://vdt.cs.wisc.edu/support.html
Subject: Re: [vdt-support #3633] slimming some VDT packages
Date: Fri, 27 Jun 2008 16:02:57 -0500 (CDT)
To: Alain Roy via RT <vdt-support@OPENSCIENCEGRID.ORG>
From: Marco Mambelli <marco@hep.uchicago.edu>
Hi Alain,
here is the dccp example:
]$ ls -l /share/wn-client/dccp/lib/
total 5752
-rw-r--r-- 1 root root 271166 Jan 14 13:32 libdcap.so
-rw-r--r-- 1 root root 271166 Jan 14 13:32 libdcap1.2.42.so
-rw-r--r-- 1 root root 4713098 Jan 14 13:32 libgsiTunnel.so
-rw-r--r-- 1 root root 297759 Jan 14 13:32 libpdcap.so
-rw-r--r-- 1 root root 297759 Jan 14 13:32 libpdcap1.2.42.so

wn-client is installed in /share/wn-client.
libdcap.so and libpdcap.so could be links to the fully numbered versions.

Cheers,
Marco

Subject: Re: [vdt-support #3633] slimming some VDT packages
Date: Sun, 29 Jun 2008 17:49:27 -0500
To: vdt-support@OPENSCIENCEGRID.ORG
From: Alain Roy <roy@cs.wisc.edu>
Hi Marco,

Yeah, I see the reason for that. The real problem is that we don't
build dccp from scratch, but extract it from an RPM. Our extraction
process doesn't preserve symlinks.

I'll see if we can fix this in a future update.

Thanks,
-alain

-----------------------------------------------------------------
Alain Roy vdt-support@opensciencegrid.org
VDT Support http://vdt.cs.wisc.edu/support.html

On Jun 27, 2008, at 4:06 PM, marco@hep.uchicago.edu via RT wrote:

> http://vdt.cs.wisc.edu/rt/Ticket/Display.html?id=3633

The problem here is that rpm2cpio | cpio -ivd doesn't preserve symlinks.
I spent a few minutes looking at it and didn't find an obvious way to
preserve symlinks. So I see three solutions to this problem:

1) Find some option to rpm2cpio or cpio that I didn't see that fixes the
problem.
2) Make the symlinks manually after doing the extraction.
3) Build dccp from scratch.

I think that #2 is easy.

While it might seem strange that I make this ticket a priority 3 ticket,
I think it's a really quick fix so we should just do it soon. If we
don't, it will linger forever.

cpio is capable of storing and restoring symlinks.  In the specific case of dccp, the symlinks are simply not there; the files are copies. This was confirmed by doing a root install of the upstream RPM.

I'll do a bit more poking around to see how common this is, whether other packages we convert from RPM to tar.gz work as expected (with symlinks), and how helpful replacing copies with symlinks would be.

Assuming it's decided that converting identical copies into symlinks is a good idea, we'll need to decide what is identical enough to warrant replacement.  Here's my rough proposal:

* Files must be from the same package.  (Otherwise uninstalling one package could break another.  I also don't expect that cross package links will catch much, if anything.)

* Files must be in the same directory (Probably unnecessary, but erring on the side of caution. Also makes finding identical files faster.)

* Files must match lib*.so* (We don't want to link other files that may be modified once a service starts.  Limiting this to libraries seems a good solution.)

* The file size must be the same (obviously)

* The date does _not_ need to be the same.

* We compare MD5sums, Just In Case. (We could do a direct comparison, but using and caching MD5sums means we can easily compare a bunch of binaries if, say, libfoo.so, libfoo.so.1, and libfoo.so.1.0.1 are all present.)

* We link from libFOO.so to libFOO.so.version, in keeping with how it's usually done.
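
The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code that ended up in the VDT build: the function names md5sum and link_duplicate_libs are hypothetical, the directory passed in is assumed to belong to a single package, and the "fully numbered" file is guessed to be the one with the longest name.

```python
import fnmatch
import hashlib
import os
from collections import defaultdict


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def link_duplicate_libs(directory, dry_run=True):
    """Replace identical lib*.so* copies in one directory with symlinks.

    Returns a sorted list of (link_name, target_name) pairs. With
    dry_run=True nothing on disk is changed.
    """
    # Group candidates by size first, so MD5 is only computed for files
    # that could possibly be identical. Dates are deliberately ignored.
    by_size = defaultdict(list)
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not fnmatch.fnmatchcase(name, "lib*.so*"):
            continue
        if os.path.islink(path) or not os.path.isfile(path):
            continue
        by_size[os.path.getsize(path)].append(name)

    # Within each size group, group again by MD5 digest.
    by_content = defaultdict(list)
    for size, names in by_size.items():
        if len(names) < 2:
            continue
        for name in names:
            digest = md5sum(os.path.join(directory, name))
            by_content[(size, digest)].append(name)

    links = []
    for names in by_content.values():
        if len(names) < 2:
            continue
        # Assumption: the longest name is the most fully versioned one.
        target = max(names, key=lambda n: (len(n), n))
        for name in names:
            if name == target:
                continue
            links.append((name, target))
            if not dry_run:
                path = os.path.join(directory, name)
                os.remove(path)
                os.symlink(target, path)  # relative link, same directory
    return sorted(links)
```

Calling link_duplicate_libs("/share/wn-client/dccp/lib") with the default dry_run=True would only list the candidate links; passing dry_run=False performs the replacement.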

Subject: Re: [vdt-support #3633] slimming some VDT packages
Date: Tue, 26 Aug 2008 21:28:54 -0700
To: vdt-support@OPENSCIENCEGRID.ORG
From: Alain Roy <roy@cs.wisc.edu>
On Aug 26, 2008, at 4:06 PM, Alan De Smet via RT wrote:
> cpio is capable of storing and restoring symlinks. In the specific case
> of dccp,
> the symlinks are simply not there; the files are copies. This was
> confirmed by doing
> a root install of the upstream RPM.

Interesting, I didn't catch that when I first looked at it.

> I'll do a bit more poking around to see how common this is, if other
> packages
> we convert from RPM to tar.gz do work as expected (with symlinks),
> and how
> helpful replacing copies with symlinks would be.

Great, thanks.

> Assuming it's decided that converting identical copies into symlinks
> is a good
> idea, we'll need to decide what is identical enough to warrant
> replacement.
> Here's my rough proposal:
>
> * Files must be from the same package. (Otherwise uninstalling one
> package
> could break another. I also don't expect that cross package links
> will catch
> much, if anything.)
>
> * Files must be in the same directory (Probably unnecessary, but
> erring on the
> side of caution. Also makes finding identical files faster.)
>
> * Files must match lib*.so* (We don't want to link other files
> that may be
> modified once a service starts. Limiting this to libraries seems a
> good
> solution.)
>
> * The file size must be the same (obviously)
>
> * The date does _not_ need to be the same.
>
> * We compare MD5sums, Just In Case. (We could do a direct
> comparison, but using
> and caching MD5sums means we can easily compare a bunch of binaries
> if, say,
> libfoo.so, libfoo.so.1, and libfoo.so.1.0.1 are all present.)
>
> * We link from libFOO.so to libFOO.so.version, in keeping with how it's
> usually
> done.

I think that sounds right. For DCCP, it should just be a few files.

Thanks,
-alain

After reviewing a standard VDT install, I found that the only case where this happens is DCCP.  So I'll be special-casing our DCCP re-package to catch it, not doing anything clever for the build system as a whole.
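
A review like this could be done with a small read-only scan of the install tree. The sketch below is an assumption about how one might do it, not the procedure actually used; the names file_md5 and survey_duplicate_libs are hypothetical. It walks a directory tree and reports, per directory, groups of regular lib*.so* files with the same size and MD5 digest, along with the bytes that symlinks would save.

```python
import fnmatch
import hashlib
import os
from collections import defaultdict


def file_md5(path):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def survey_duplicate_libs(root):
    """Report identical lib*.so* copies under root without changing anything.

    Returns a sorted list of (directory, names, wasted_bytes) tuples, where
    wasted_bytes is the space taken by every copy beyond the first.
    """
    report = []
    for dirpath, _dirnames, filenames in os.walk(root):
        groups = defaultdict(list)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not fnmatch.fnmatchcase(name, "lib*.so*"):
                continue
            # Existing symlinks are already "slim"; only count real files.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            groups[(os.path.getsize(path), file_md5(path))].append(name)
        for (size, _digest), names in groups.items():
            if len(names) > 1:
                report.append((dirpath, sorted(names), size * (len(names) - 1)))
    return sorted(report)
```

For example, survey_duplicate_libs("/share/wn-client") would scan the wn-client install location mentioned earlier in this ticket.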

Subject: [vdt-support #3633] SVN commit, rev 7974
To: vdt-support@cs.wisc.edu
From: adesmet@cs.wisc.edu
Commit comment:

Change copies of Dccp libraries into symlinks.
This is how libraries are usually handled, but
the upstream package didn't. This saves a bit
of space on installs.


Changed files:
U vdt/branches/vdt-1.10.1/Dccp/make-tarball
U vdt/branches/vdt-1.10.1/defs

To generate a diff:
svn diff -c 7974 file:///p/vdt/workspace/svn

Tested, documented in release notes, checked in, made.