Ugarit command-line reference
Your first backup
Think of a tag to identify the filesystem you're backing up. If it's /home on the server gandalf, you might call it gandalf-home. If it's the entire filesystem of the server bilbo, you might just call it bilbo.
Then from your shell, run (as root):
# ugarit snapshot <ugarit.conf> [-c] [-a] <tag> <path to root of filesystem>
For example, if we have a ugarit.conf in the current directory:
# ugarit snapshot ugarit.conf -c localhost-etc /etc
Specify the -c flag if you want to store ctimes in the vault. Since it's impossible to restore ctimes when extracting from a vault, doing this is useful only for informational purposes, so it's not done by default. Similarly, atimes aren't stored in the vault unless you specify -a. The previous snapshot will have changed the atime of every file, so with -a specified, every directory in your filesystem will be uploaded again on every snapshot! Ugarit will happily restore atimes if they are found in a vault; their storage is made optional simply because uploading them is costly and rarely useful.
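The flag placement described above can be sketched as a small Python helper that assembles the argument list for ugarit snapshot. The function name snapshot_command and its keyword arguments are hypothetical; only the command-line shape comes from the usage line above, and the sketch builds the command without running it.

```python
import shlex


def snapshot_command(conf, tag, path, store_ctimes=False, store_atimes=False):
    """Build the argv for: ugarit snapshot <ugarit.conf> [-c] [-a] <tag> <path>."""
    argv = ["ugarit", "snapshot", conf]
    if store_ctimes:
        argv.append("-c")  # ctimes are informational only; they cannot be restored
    if store_atimes:
        argv.append("-a")  # costly: directories are re-uploaded on every snapshot
    argv += [tag, path]
    return argv


# Print the command instead of running it, so it can be inspected first.
print(shlex.join(snapshot_command("ugarit.conf", "localhost-etc", "/etc",
                                  store_ctimes=True)))
```

Passing the returned list to subprocess.run(argv, check=True), as root, would then take the snapshot.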
Exploring the vault
Now you have a backup, you can explore the contents of the vault. This need not be done as root, as long as you can read ugarit.conf; however, if you want to extract files, run it as root so the uids and gids can be set.
$ ugarit explore ugarit.conf
This will put you into an interactive shell exploring a virtual filesystem. The root directory contains an entry for every tag; if you type ls you should see your tag listed, and within that tag, you'll find a list of snapshots, in descending date order, with a special entry current for the most recent snapshot. Within a snapshot, you'll find the root directory of your snapshot under contents, and the details of the snapshot itself in properties.sexpr, and will be able to cd into subdirectories, and so on:
> ls
localhost-etc/ <tag>
> cd localhost-etc
/localhost-etc> ls
current/ <snapshot>
2015-06-12 22:49:34/ <snapshot>
2015-06-12 22:49:25/ <snapshot>
/localhost-etc> cd current
/localhost-etc/current> ls
log.sexpr <file>
properties.sexpr <inline>
contents/ <dir>
/localhost-etc/current> cat properties.sexpr
((previous . "a140e6dbe0a7a38f8b8c381323997c23e51a39e2593afb61")
 (mtime . 1434102574.0)
 (contents . "34eccf1f5141187e4209cfa354fdea749a0c3c1c4682ec86")
 (stats (blocks-stored . 12) (bytes-stored . 16889) (blocks-skipped . 50) (bytes-skipped . 6567341) (file-cache-hits . 0) (file-cache-bytes . 0))
 (log . "b2a920f962c12848352f33cf32941e5313bcc5f209219c1a")
 (hostname . "ahe")
 (source-path . "/etc")
 (notes)
 (files . 112)
 (size . 6563588))
/localhost-etc/current> cd contents
/localhost-etc/current/contents> ls
zoneinfo <symlink>
vconsole.conf <symlink>
udev/ <dir>
tmpfiles.d/ <dir>
systemd/ <dir>
sysctl.d/ <dir>
sudoers.tmp~ <file>
sudoers <file>
subuid <file>
subgid <file>
static <symlink>
ssl/ <dir>
ssh/ <dir>
shells <symlink>
shadow- <file>
shadow <file>
services <symlink>
samba/ <dir>
rpc <symlink>
resolvconf.conf <symlink>
resolv.conf <file>
-- Press q then enter to stop or enter for more...
q
/localhost-etc/current/contents> ls -ll resolv.conf
-rw-r--r-- 0 0 [2015-05-23 23:22:41] 78B/-: resolv.conf
key: #f
contents: "e33ea1394cd2a67fe6caab9af99f66a4a1cc50e8929d3550"
size: 78
ctime: 1432419761.0
As well as exploring around, you can also extract files or directories (or entire snapshots) by using the get command. Ugarit will do its best to restore the metadata of files, subject to the rights of the user you run it as.
Type help to get help in the interactive shell.
The interactive shell supports command-line editing, history and tab completion for your convenience.
Extracting things directly
As well as using the interactive explore mode, it is also possible to directly extract something from the vault, given a path.
Given the sample vault from the previous example, it would be possible to extract the README.txt file with the following command:
$ ugarit extract ugarit.conf /Test/current/contents/README.txt
Forking tags
As mentioned above, you can fork a tag, creating two tags that refer to the same snapshot and its history but that can then have their own subsequent history of snapshots applied to each independently, with the following command:
$ ugarit fork <ugarit.conf> <existing tag> <new tag>
Merging tags
And you can also merge two or more tags into one. It's possible to merge a bunch of tags to make an entirely new tag, or you can merge a tag into an existing tag, by having the "output" tag also be one of the "input" tags.
The command to do this is:
$ ugarit merge <ugarit.conf> <output tag> <input tags...>
For instance, to import your classical music collection into your main musical collection, you might do:
$ ugarit merge ugarit.conf my-music my-music classical-music
Or if you want to create a new all-music archive from the archives bobs-music and petes-music, you might do:
$ ugarit merge ugarit.conf all-music bobs-music petes-music
Archive operations
Importing
To import some files into an archive, you must create a manifest file listing them, and their metadata. The manifest can also list metadata for the import as a whole, perhaps naming the source of the files, or the reason for importing them.
The metadata for a file (or an import) is a series of named properties. The value of a property can be any Scheme value, written in Scheme syntax (with strings double-quoted unless they are to be interpreted as symbols), but strings and numbers are the most useful types.
You can use whatever names you like for properties in metadata, but there are some that the system applies automatically, and an informal standard of sorts, which is documented in docs/archive-schema.wiki.
You can produce a manifest file by hand, or use the Ugarit Manifest Maker to produce one for you. You do this by installing it like so:
$ chicken-install ugarit-manifest-maker
Then run it, giving it any number of file and directory names on the command line. When given directories, it will recursively scan them to find all the files contained therein and put them in the manifest; it will not put directories in the manifest, although it is perfectly legal for you to do so when writing a manifest by hand. This is because the manifest maker can't do much useful analysis on a directory to suggest default metadata for it (so there isn't much point in using it for that), and it's far more useful for it to make it easy for you to import a large number of files individually by referencing the directory containing them.
The manifest is sent to standard output, so you need to redirect it to a file, like so:
$ ugarit-manifest-maker ~/music > music.manifest
You can specify command-line options, as well. -e PATTERN or --exclude=PATTERN introduces a glob pattern for files to exclude from the manifest, and -D KEY=VALUE or --define=KEY=VALUE provides a property to be added to every file in the manifest (as opposed to an import property, that is part of the metadata of the overall import). Note that VALUE must be double-quoted if it's a string, as per Scheme value syntax.
One might use this like so:
$ ugarit-manifest-maker -e '*.txt' -D rating=5 ~/favourite-music > music.manifest
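Because the glob pattern and any double-quoted string VALUE both contain characters the shell treats specially, it can help to let a program do the quoting. The following Python sketch uses the standard shlex module to print a shell-safe command line; the directory name favourite-music and the dc:subject value are illustrative, and the sketch assumes the double quotes around a string VALUE must reach ugarit-manifest-maker intact.

```python
import shlex

# Each list element is one argument exactly as the program should receive it.
argv = [
    "ugarit-manifest-maker",
    "-e", "*.txt",                  # glob pattern, must not be expanded by the shell
    "-D", 'dc:subject="Trip-Hop"',  # string VALUE keeps its Scheme double quotes
    "favourite-music",
]

# shlex.join quotes only the arguments that need it.
command_line = shlex.join(argv)
print(command_line + " > music.manifest")
```

The printed line can be pasted into a shell as-is; the single quotes it adds stop the shell from expanding the glob or stripping the double quotes.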
The manifest maker simplifies the writing of manifests for files, by listing the files in manifest format along with useful metadata extracted from the filename and the file itself. For supported file types (currently, MP3 and OGG music files), it will even look inside the file to extract metadata.
The manifest file it generates will contain lots of comments mentioning things it couldn't automatically analyse (such as unknown OGG/ID3 tags, or unknown types of files); and for metadata properties it thinks might be relevant but can't automatically provide, it suggests them with an empty property declaration, commented out. The idea is that, after generating a manifest, you read it by hand in a text editor to attempt to improve it.
The format of a manifest file
Manifest files have a relatively simple format. They are based on Scheme s-expressions, so can contain comments. From any semicolon (not in a string or otherwise quoted) to the end of the line is a comment, and #; in front of something comments out that something.
Import metadata properties are specified like so:
(KEY = VALUE)
...where, as usual, VALUE must be double-quoted if it's a string.
Files to import, with their metadata, are specified like so:
(object "PATH OF FILE TO IMPORT" (KEY = VALUE) (KEY = VALUE)... )
The closing parenthesis need not be on a line of its own; it's conventionally placed after the closing parenthesis of the final property.
Ugarit, when importing the files in the manifest, will add the following properties if they are not already specified:
import-path - The path the file was imported from
dc:format - A guess at the file's MIME type, based on the extension
mtime - The file's modification time (as the number of seconds since the UNIX epoch)
ctime - The file's change time (as the number of seconds since the UNIX epoch)
filename - The name of the file, stripped of any directory components, and including the extension.
The following properties are placed in the import metadata, automatically:
hostname - The hostname the import was performed on.
manifest-path - The path to the manifest file used for the import.
mtime - The time (in seconds since the UNIX epoch) at which the import was committed.
stats - A Scheme alist of statistics about the import (number of files/blocks uploaded, etc).
So, to wrap that all up, here's a sample import manifest file:
(notes = "A bunch of old CDs I've finally ripped")

(object "/home/alaric/newrip/track01.mp3"
  (filename = "track01.mp3")
  (dc:format = "audio/mpeg")
  (dc:publisher = "Go! Beat Records")
  (dc:created = "1994")
  (dc:contributor = "Portishead")
  (dc:subject = "Trip-Hop")
  (superset:size = 1)
  (superset:index = 1)
  (set:title = "Dummy")
  (set:size = 11)
  (set:index = 1)
  (dc:creator = "Portishead")
  (dc:title = "Wandering Star")
  (mtime = 1428962299.0)
  (ctime = 1428962299.0)
  (file-size = 4703055))

;;... and so on, for ten more MP3s on this CD, then several other CDs...
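If you already hold the metadata in a program, a manifest in the format shown above can be generated rather than typed. This is a minimal Python sketch, not part of Ugarit: the function names are hypothetical, it handles only strings and numbers (the types the text calls most useful), and it assumes backslash escaping of double quotes inside strings, as in standard Scheme string syntax.

```python
def scheme_literal(value):
    """Render a Python str or number as Scheme literal text."""
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"unsupported property value: {value!r}")
    return repr(value)


def manifest_property(key, value):
    """Render one (KEY = VALUE) property."""
    return f"({key} = {scheme_literal(value)})"


def manifest_object(path, properties):
    """Render an (object "PATH" (KEY = VALUE)...) entry.

    properties is a sequence of (key, value) pairs, kept in the given order.
    """
    lines = [f"(object {scheme_literal(path)}"]
    lines += [f"  {manifest_property(key, value)}" for key, value in properties]
    # By convention the closing parenthesis follows the final property.
    return "\n".join(lines) + ")"


print(manifest_property("notes", "A bunch of old CDs I've finally ripped"))
print(manifest_object("/home/alaric/newrip/track01.mp3",
                      [("dc:title", "Wandering Star"), ("set:index", 1)]))
```

Writing the printed text to a file gives a manifest that can then be reviewed by hand before importing.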
Actually importing a manifest
Well, when you finally have a manifest file, importing it is easy:
$ ugarit import <ugarit.conf> <archive tag> <manifest path>
How do I change the metadata of an already-imported file?
That's easy; the "current" metadata of a file is the metadata of its most recent import. Just import the file again, in a new manifest, with new metadata, and it will overwrite the old. However, the old metadata is still preserved in the archive's history; tags forked from the archive tag before the second import will still see the original state of the archive, by design.
Exploring
Archives are visible in the explore interface. For instance, an import of some music I did looks like this:
> ls
localhost-etc/ <tag>
archive-tag/ <tag>
> cd archive-tag
/archive-tag> ls
history/ <archive-history>
/archive-tag> cd history
/archive-tag/history> ls
2015-06-12 22:53:13/ <import>
/archive-tag/history> cd 2015-06-12 22:53:13
/archive-tag/history/2015-06-12 22:53:13> ls
log.sexpr <file>
properties.sexpr <inline>
manifest/ <import-manifest>
/archive-tag/history/2015-06-12 22:53:13> cat properties.sexpr
((stats (blocks-stored . 2046) (bytes-stored . 1815317503) (blocks-skipped . 9) (bytes-skipped . 8388608) (file-cache-hits . 0) (file-cache-bytes . 0))
 (log . "b2a920f962c12848352f33cf32941e5313bcc5f209219c1a")
 (mtime . 1434135993.0)
 (contents . "fcdd5b996914fdcac1e8a6cfbc67663e08f6eaf0cc952e21")
 (hostname . "ahe")
 (notes . "A bunch of music, imported as a demo")
 (manifest-path . "/home/alaric/tmp/test.manifest"))
/archive-tag/history/2015-06-12 22:53:13> cd manifest
/archive-tag/history/2015-06-12 22:53:13/manifest> ls
1d4269099189234eefeb80b95370eaf280730cf4d591004d:03 The Lemon Song.mp3 <file>
7cb253a4886b3e0051ea8cc0e78fb3a0160307a2c37c8382:04 Dazed and Confused.mp3 <file>
64092fa12c2800dda474b41e5ebe8c948f39a59ee91c120b:09 How Many More Times.mp3 <file>
1d79148d1e1e8947c50b44cf2d5690588787af328e82eeef:2-07 Going to California.mp3 <file>
e3685148d0d12213074a9fdb94a00e05282aeabe77fa60d5:1-01 You Shook Me.mp3 <file>
d73904f371af8d7ca2af1076881230f2dc1c2cf82416880a:03 Strangers.mp3 <file>
9c5a0efb7d397180a1e8d42356d8f04c6c26a83d3b05d34a:09 Uptight.mp3 <file>
01a069aec2e731e18fcdd4ecb0e424f346a2f0e16910f5e9:07 Numb.mp3 <file>
7ea1ab7fbd525c40e21d6dd25130e8c70289ad56c09375b0:08 She.mp3 <file>
009dacd8f3185b7caeb47050002e584ab86d08cf9e9aceec:1-03 Communication Breakdown.mp3 <file>
26d264d629e22709f664ed891741f690900d45cd4fd44326:1-03 Dazed and Confused.mp3 <file>
d879761195faf08e4e95a5a2398ea6eefb79920710bfeab6:1-10 Band Introduction _ How Many More Times.mp3 <file>
83244601db42677d110fc8522c6a3cbbc1f22966a779f876:06 All My Love.mp3 <file>
5eebee9a2ad79d04e4f69e9e2a92c4e0a8d5f21e670f89da:07 Tangerine.mp3 <file>
dd6f1203b5973ecd00d2c0cee18087030490230727591746:2-08 That's the Way.mp3 <file>
c0acea15aa27a6dd1bcaff1c13d4f3d741a40a46abeca3fc:04 The Crunge.mp3 <file>
ea7727ad07c6c82e5c9c7218ee1b059cd78264c131c1438d:1-02 I Can't Quit You Baby.mp3 <file>
10fda5f46b8f505ca965bcaf12252eedf5ab44514236f892:14 F.O.D..mp3 <file>
a99ca9af5a83bde1c676c388dc273051defa88756df26e95:1-03 Good Times Bad Times.mp3 <file>
b5d7cfe9808c7fc0dedbd656d44e4c56159cbd3c2ed963bb:1-15 Stairway to Heaven.mp3 <file>
79c87e3c49ffdac175c95aae071f63d3a9efdf2ddb84998c:08.Batmilk.ogg <file>
-- Press q then enter to stop or enter for more...
q
/archive-tag/history/2015-06-12 22:53:13/manifest> ls -ll 7cb253a4886b3e0051ea8cc0e78fb3a0160307a2c37c8382:04 Dazed and Confused.mp3
-r-------- - - [2015-04-13 21:46:39] -/-: 7cb253a4886b3e0051ea8cc0e78fb3a0160307a2c37c8382:04 Dazed and Confused.mp3
key: #f
contents: "7cb253a4886b3e0051ea8cc0e78fb3a0160307a2c37c8382"
import-path: "/home/alaric/archive/sorted-music/Led Zeppelin/Led Zeppelin/04 Dazed and Confused.mp3"
filename: "04 Dazed and Confused.mp3"
dc:format: "audio/mpeg"
dc:publisher: "Atlantic"
dc:subject: "Classic Rock"
dc:title: "Dazed and Confused"
dc:creator: "Led Zeppelin"
dc:created: "1982"
dc:contributor: "Led Zeppelin"
set:title: "Led Zeppelin"
set:index: 4
set:size: 9
superset:index: 1
superset:size: 1
ctime: 1428957999.0
file-size: 15448903
Searching
However, the explore interface to an archive is far from pleasant. You need to go to the correct import, find your file by name, and then refer to it by a big long name composed of its hash and the original filename in order to see its properties or extract it.
I hope to add property-based searching to explore mode in future (which is why you need to go into a history directory within the archive directory, as other ways of exploring the archive will appear alongside). This will be particularly useful when the explore-mode virtual filesystem is mounted over 9P!
However, even that interface, being constrained to look like a filesystem, will be limited. The ugarit command-line tool provides a very powerful search interface that exposes the full power of the archive metadata.
Metadata filters
Files (and directories) in an archive can be searched for using "metadata filters", which are descriptions of what you're looking for that the computer can understand. They are represented as Scheme s-expressions, and can be made up of the following components:
#t - This filter matches everything. It's not very useful.
#f - This filter matches nothing. It's not very useful.
(and FILTER FILTER...) - This filter matches files for which all of the inner filters match.
(or FILTER FILTER...) - This filter matches files for which any of the inner filters match.
(not FILTER) - This filter matches files which do not match the inner filter.
(= ($ PROP) VALUE) - This filter matches files which have the property PROP equal to VALUE in their metadata.
(= key HASH) - This filter matches the file with the given hash.
(= ($import PROP) VALUE) - This filter matches files which have the property PROP equal to VALUE in the metadata of the import that last imported them.
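Since filters nest, composing them from small pieces avoids mismatched parentheses. The components listed above can be sketched as Python string builders; the helper names are hypothetical, only string and number values are handled, and the (= key HASH) form is left out because the text does not say how HASH should be written.

```python
def _literal(value):
    """Render a Python str or number as Scheme literal text."""
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"unsupported filter value: {value!r}")
    return repr(value)


def prop_equals(prop, value):
    """(= ($ PROP) VALUE): match on a file's own metadata."""
    return f"(= ($ {prop}) {_literal(value)})"


def import_prop_equals(prop, value):
    """(= ($import PROP) VALUE): match on the last import's metadata."""
    return f"(= ($import {prop}) {_literal(value)})"


def all_of(*filters):
    return "(and " + " ".join(filters) + ")"


def any_of(*filters):
    return "(or " + " ".join(filters) + ")"


def negate(inner):
    return f"(not {inner})"


# Files credited to an artist as either creator or contributor.
artist = "Led Zeppelin"
print(any_of(prop_equals("dc:creator", artist),
             prop_equals("dc:contributor", artist)))
```

The resulting string is a single filter argument; it still needs quoting when pasted into a shell, because it contains spaces, parentheses and a dollar sign.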
Searching an archive
For a start, you can search for files matching a given metadata filter in a given archive. This is done with:
$ ugarit search <ugarit.conf> <archive tag> <filter>
For instance, let's look for music by Led Zeppelin:
$ ugarit search ugarit.conf music '(or (= ($ dc:creator) "Led Zeppelin") (= ($ dc:contributor) "Led Zeppelin"))'
The result looks like the explore-mode view of an archive manifest, listing the file's hash followed by its title and extension:
7cb253a4886b3e0051ea8cc0e78fb3a0160307a2c37c8382:04 Dazed and Confused.mp3
834a1619a59835e0c27b22801e3c829b40be583dadd19770:2-08 No Quarter.mp3
9e8bc4954838bd9c671f275eb48595089257185750d63894:1-12 I Can't Quit You Baby.mp3
6742b3bebcdd9cae5ec5403c585935403fa74d16ed076cf2:02 Friends (1).mp3
07d161f4bd684e283f7f2cf26e0b732157a8e95ef66939c3:05 Carouselambra.mp3
[...]
What of all our lovely metadata? The last argument on the command line lets you specify an alternate output format; add the word "verbose" there to see the metadata too:
$ ugarit search ugarit.conf music '(or (= ($ dc:creator) "Led Zeppelin") (= ($ dc:contributor) "Led Zeppelin"))' verbose
Now the output looks like:
object a444ff6ef807b080b536155f58d246d633cab4a0eabef5bf
  (ctime = 1428958660.0)
  (dc:contributor = "Led Zeppelin")
  (dc:created = "2008")
  (dc:creator = "Led Zeppelin")
  [... all the usual file properties omitted ...]
import a43f7a7268ee8b18381c20d7573add5dbf8781f81377279c
  (stats = ((blocks-stored . 2046) (bytes-stored . 1815317503) (blocks-skipped . 9) (bytes-skipped . 8388608) (file-cache-hits . 0) (file-cache-bytes . 0)))
  (log = "b2a920f962c12848352f33cf32941e5313bcc5f209219c1a")
  [... all the usual import properties omitted ...]
object b4cadf48b2c07ccf0303fc4064b292cb222980b0d4223641
  (ctime = 1428958673.0)
  (dc:contributor = "Led Zeppelin")
  (dc:created = "2008")
  (dc:creator = "Led Zeppelin")
  (dc:creator = "Jimmy Page/John Paul Jones/Robert Plant")
  [...and so on...]
As you can see, it lists the hash of each file, its metadata, the hash of the import that last imported it, and the metadata of that import.
That's quite verbose, so you'd probably be wanting to take that as input to another program to do something nicer with it. But it's laid out for human reading, not for machine parsing. Thankfully, we have other formats for that, alist and alist-with-imports.
Try this:
$ ugarit search ugarit.conf music '(or (= ($ dc:creator) "Led Zeppelin") (= ($ dc:contributor) "Led Zeppelin"))' alist
This outputs one Scheme s-expression list per match, the first element of which is the hash as a string, the rest of which is an alist of properties:
("7cb253a4886b3e0051ea8cc0e78fb3a0160307a2c37c8382"
 (ctime . 1428957999.0)
 (dc:contributor . "Led Zeppelin")
 (dc:created . "1982")
 (dc:creator . "Led Zeppelin")
 [... elided file properties ...]
 (superset:index . 1)
 (superset:size . 1))
("77c960d09eb21ed72e434ddcde0bd3781a4f3d6ee7a6eb66"
 (ctime . 1428958981.0)
 (dc:contributor . "Led Zeppelin")
 [...]
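To give a feel for consuming the alist format from another program, here is a minimal Python sketch of a reader for one match. It is an assumption-laden illustration rather than a full Scheme reader: it understands only lists, dotted pairs, double-quoted strings with backslash escapes, numbers and bare symbols, it would not accept the [...] elisions used in the sample above, and all names in it are hypothetical. Values are collected into lists because a property can occur more than once for one file, as dc:creator does in the verbose output.

```python
import re

_TOKEN = re.compile(r'\s*(?:([()])|"((?:[^"\\]|\\.)*)"|([^\s()"]+))')


def _tokens(text):
    """Yield "(" or ")" strings, ("str", text) and ("atom", text) tuples."""
    text = text.strip()
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at offset {pos}")
        pos = match.end()
        paren, string, atom = match.groups()
        if paren:
            yield paren
        elif string is not None:
            yield ("str", re.sub(r"\\(.)", r"\1", string))
        else:
            yield ("atom", atom)


def _atom(text):
    """Convert a bare token to int or float, else keep it as a symbol name."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


def _read(tok, tokens):
    """Read one datum: lists become Python lists, (a . b) becomes a tuple."""
    if tok == ")":
        raise ValueError("unexpected closing parenthesis")
    if tok != "(":
        kind, value = tok
        return value if kind == "str" else _atom(value)
    items = []
    for tok in tokens:
        if tok == ")":
            return items
        if tok == ("atom", "."):
            tail = _read(next(tokens), tokens)
            if len(items) != 1 or next(tokens) != ")":
                raise ValueError("unsupported dotted form")
            return (items[0], tail)
        items.append(_read(tok, tokens))
    raise ValueError("unclosed list")


def parse_alist_match(text):
    """Parse one ("HASH" (prop . value) ...) match into (hash, properties)."""
    tokens = _tokens(text)
    key, *entries = _read(next(tokens), tokens)
    properties = {}
    for entry in entries:
        if isinstance(entry, tuple):
            name, value = entry                # (name . value)
        else:
            name, value = entry[0], entry[1:]  # (name item ...), value is a list
        properties.setdefault(name, []).append(value)
    return key, properties
```

For example, parse_alist_match('("abc" (dc:creator . "Led Zeppelin") (set:index . 4))') returns the hash "abc" together with a dictionary mapping each property name to a list of its values.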
The alist-with-imports format includes the import metadata as well:
$ ugarit search ugarit.conf music '(or (= ($ dc:creator) "Led Zeppelin") (= ($ dc:contributor) "Led Zeppelin"))' alist-with-imports
This outputs one s-expression list per match, with four elements. The first is the key string, the second is an alist of file properties, the third is the import's hash, and the last is an alist containing the import's properties. It looks like:
("64fa08a0080aee6ef501c408fd44dfcc634cfcafd8006fc4"
 ((ctime . 1428958683.0)
  (dc:contributor . "Led Zeppelin")
  (dc:created . "2008")
  (dc:creator . "Led Zeppelin")
  [... elided file properties ...]
  (superset:index . 1)
  (superset:size . 1))
 "a43f7a7268ee8b18381c20d7573add5dbf8781f81377279c"
 ((stats (blocks-stored . 2046) (bytes-stored . 1815317503) [... elided manifest properties ...]
  (manifest-path . "test.manifest")))
("4cd56f916a63399b252976e842dcae0b87f058b5a60c93a4"
 ((ctime . 1428958437.0)
  (dc:contributor . "Led Zeppelin")
  [...]
And finally, you might just want to get the hashes of matching files (which are particularly useful for extraction operations, which we'll come to next). To do this, specify a format of "keys", which outputs one line per match, containing just the hash:
$ ugarit search ugarit.conf music '(or (= ($ dc:creator) "Led Zeppelin") (= ($ dc:contributor) "Led Zeppelin"))' keys
ce6f6484337de772de9313038cb25d1b16e28028136cc291
6af5c664cbfa1acb22a377e97aee35d94c0fc003d239dd0c
92e91e79b384478b5aab31bf1b2ff9e25e7e2c4b48575185
6ddb9a41d4968468a904f05ecf7e0e73d2c7c7ad76bc394b
a074dddcef67cd93d92c6ffce845894aa56594674023f6e1
4f65f735bbb00a6fda4bc887b370b3160f55e5e07ec37ffa
97cc8b8ba70c39387fc08ef62311b751aea4340d636eb421
72358dbe3eb60da42eadcf6de325b2a6686f4e17ea41fa60
[...]
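When scripting searches, the optional format word is easy to mistype, and the filter needs careful shell quoting. A small Python sketch of a command builder follows; search_command and OUTPUT_FORMATS are hypothetical names, and the format list is simply the four formats named in this section, which may not be exhaustive.

```python
import shlex

# The output formats mentioned in this section.
OUTPUT_FORMATS = ("verbose", "alist", "alist-with-imports", "keys")


def search_command(conf, tag, filter_expr, output_format=None):
    """Build the argv for: ugarit search <conf> <tag> <filter> [format]."""
    argv = ["ugarit", "search", conf, tag, filter_expr]
    if output_format is not None:
        if output_format not in OUTPUT_FORMATS:
            raise ValueError(f"unknown output format: {output_format!r}")
        argv.append(output_format)
    return argv


argv = search_command("ugarit.conf", "music",
                      '(= ($ dc:subject) "Trip-Hop")', "keys")
# shlex.join single-quotes the filter so the shell leaves it untouched.
print(shlex.join(argv))
```

Run through subprocess.run(argv, capture_output=True, text=True), the list form needs no shell quoting at all, since no shell is involved.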
However, to write filter expressions, you need to know what properties you have available to search on. You might remember, or go for standard properties, or look at existing files in verbose mode to find some; but you can also just ask Ugarit what properties it has in an archive, like so:
$ ugarit search-props <ugarit.conf> <archive tag>
You can even ask what properties are available for files matching an existing filter:
$ ugarit search-props <ugarit.conf> <archive tag> <filter>
This is useful if you're interested in further narrowing down a filter, and so only care about properties that files already matching that filter have.
For a bunch of music files imported with the Ugarit Manifest Maker, you can expect to see something like this:
ctime
dc:contributor
dc:created
dc:creator
dc:format
dc:publisher
dc:subject
dc:title
file-size
filename
import-path
mtime
set:index
set:size
set:title
superset:index
superset:size
Now you know what properties to search, next you'll be wanting to know what values to look for. Again, Ugarit has a command to query the available values of any given property:
$ ugarit search-values <ugarit.conf> <archive tag> <property>
And you can limit that just to files matching a given filter:
$ ugarit search-values <ugarit.conf> <archive tag> <filter> <property>
The resulting list of values is ordered by popularity, so the most widely-used values will be listed first. Let's see what genres of music were in my sample of music files I imported:
$ ugarit search-values test.conf archive-tag dc:subject
The result is:
Classic Rock
Alternative & Punk
Electronic
Trip-Hop
Ok, let's now use a filter to find out what artists (dc:creator) I have that made Trip-Hop music (what even IS that?):
$ ugarit search-values test.conf archive-tag \
  '(= ($ dc:subject) "Trip-Hop")' \
  dc:creator
The result is:
Portishead
Ah, OK, now I know what "Trip-Hop" is.
Extracting
All this searching is lovely, but what it gets us, in the end, is a bunch of file hashes. Perhaps we might want to actually play some music, or look at a photo, or something. To do that, we need to extract from the archive.
We've already seen the contents of an archive in the explore mode virtual filesystem, so we could go into the archive history, find the import, go into the manifest, pick the file out there, and use get to extract it, but that would be yucky. Thankfully, we have a command-line interface to get things from archives, in one of two ways.
Firstly, we can extract a file (or a directory tree) from an archive, out into the local filesystem:
$ ugarit archive-extract <ugarit.conf> <archive tag> <hash> <target>
The "target" is the name to give it in the local filesystem. We could pull out that Led Zeppelin song from our search results above, like so:
$ ugarit archive-extract test.conf archive-tag \
  ce6f6484337de772de9313038cb25d1b16e28028136cc291 foo.mp3
We now have a foo.mp3 file in the current directory.
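Extracting every match of a search means repeating that command once per hash. A Python sketch of the idea follows: it turns the text printed by a search in the keys format into one archive-extract command per file. The function name, the choice to name each target after its hash, and the optional suffix are all assumptions made for illustration; the sketch only builds the commands and does not run them.

```python
def extract_commands(keys_output, conf, tag, target_dir, suffix=""):
    """Build one `ugarit archive-extract` argv per hash in keys-format output."""
    commands = []
    for line in keys_output.splitlines():
        key = line.strip()
        if not key:
            continue  # ignore blank lines
        # The keys format carries no filenames, so name each file by its hash.
        target = f"{target_dir}/{key}{suffix}"
        commands.append(["ugarit", "archive-extract", conf, tag, key, target])
    return commands


sample = """ce6f6484337de772de9313038cb25d1b16e28028136cc291
6af5c664cbfa1acb22a377e97aee35d94c0fc003d239dd0c
"""
for argv in extract_commands(sample, "test.conf", "archive-tag", "out", ".mp3"):
    print(argv)
```

Each list can be handed to subprocess.run(argv, check=True) once the target directory exists.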
However, sometimes it would be nicer to have it streamed to standard output, which can be done like so:
$ ugarit archive-stream <ugarit.conf> <archive tag> <hash>
This lets us write a command such as:
$ ugarit archive-stream test.conf archive-tag \
  ce6f6484337de772de9313038cb25d1b16e28028136cc291 | mpg123 -
...to play it in real time.