ARGON
Check-in [9ed1933bce]
Overview
Comment:Thoughts on content-addressible storage
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1:9ed1933bcebc686eb9981c1857e927cfb102aeba
User & Date: alaric 2013-06-27 15:42:55
Context
2013-07-16 20:47 XENON added check-in: dd84df30e7 user: alaric tags: trunk
2013-06-27 15:42 Thoughts on content-addressible storage check-in: 9ed1933bce user: alaric tags: trunk
2012-12-06 20:23 More elaboration of CARBON check-in: f22fe263a5 user: alaric tags: trunk
Changes

Changes to README.wiki.

Before (README.wiki lines 359-437):

  *  Have a look in the WOLFRAM docs about the contents of the cluster
     entity, and specify what fraction of it needs to be cached by
     NITROGEN (which is also the subset known to a TUNGSTEN-less
     node). What can we do without? How much RAM will it take? This
     affects the scaling limits of a cluster with simple nodes.

  *  Document MERCURY interfaces to cluster, volume, and node entities.

  *  Give TUNGSTEN a content-addressed archive store, and think about
     how to use it as part of the backup model. Perhaps an entity
     could have read-only sections in its TUNGSTEN store, which live
     in the CAS. This could be a basis for document-style entities to
     be able to provide past versions of themselves via the persona
     field. Voila, version control... How do we provide the semantics
     and workflow of a DVCS, though? An IODINE interface provided by
     document entities to pull and push changes from a clone entity?

  *  Put Disqus or something on this site so people can comment and
     ask questions.

  *  Create Docbook specifications

  *  Create an index of symbols defined in the specs under
................................................................................
     CARBON and TUNGSTEN writings to not assume this as a default
     namespace prefix. There is no default namespace prefix in those
     cases, so using a default-namespace name without declaring one is
     illegal.

<h2>Further research required</h2>

  *  Design a logging framework (PLATINUM). I want all logs integrated at one
     point, then distributed as needed, rather than ending up
     correlating disparate logs. I want logs structured with enough
     metadata to allow finding relationships between events so log
     entries can be "folded up" to hide fine detail (eg, a MERCURY
     incoming request will start off being logged by IRIDIUM then lead
     to a LITHIUM invocation which will lead to CHROME activity and
     then user code calling into WOLFRAM, TUNGSTEN and MERCURY to do
     stuff, then back to MERCURY to IRIDIUM to return a result, then
     WOLFRAM to commit the TUNGSTEN state transaction, etc - but it
     should be foldable up into a single "MERCURY invocation" summary
     event. We don't need CEP systems for this, just threading of
     event IDs down through the system as part of the dynamic
     execution context and appropriate importance levels. Also,
     node-specific events need to be fed into the HYDROGEN console
     system for diagnosis, and a storage system of some kind chosen
     for log events that need recording (with per-handler, per-entity,
     per-volume, per-node and cluster-wide log level settings to
     control storage utilisation, and the ability to remove levels of
     detail in older logs as they are expired). Where are logs stored?
     Within entities, or as a special case with direct TUNGSTEN
     storage? What gets replicated when? Do we log events up to level
     N to the node only, and events up to some (lower) level M get
     replicated in real time because they're important and we need
     them to not be tamperable if a node is compromised, and the level
     N logs on the node get summarised to level X and then replicated
     every day for archival purposes? And who can see the logs? A way
     is needed for logs about a given entity to be made available to
     the administrators of that entity (see also the debug mode
     tracing mechanism in LITHIUM/ARGON, which need to boil down to
     this). This is important and major enough to need an element
     name, I feel!

  *  Should we allow for N-dimensional arrays as well as just vectors
     in IRON? They would allow for better predictive coding for sensor
     data such as images, as we'd be able to take into account pixels
     above as well as to the left of the next one (and the
     generalisation into higher dimensions thereof). If so, update the
     CARBON page's image storage example.

After (README.wiki lines 359-450):

  *  Have a look in the WOLFRAM docs about the contents of the cluster
     entity, and specify what fraction of it needs to be cached by
     NITROGEN (which is also the subset known to a TUNGSTEN-less
     node). What can we do without? How much RAM will it take? This
     affects the scaling limits of a cluster with simple nodes.

  *  Document CARBON/MERCURY interfaces to cluster, volume, and node
     entities. Think about the services they should offer to
     administrators as well as to users.

  *  Give TUNGSTEN a content-addressed archive storage mode, and think
     about how to use it as part of the backup model. Perhaps groups
     of assertions in TUNGSTEN knowledge bases that do not seem to
     change can be considered eligible for compaction into the CAS for
     efficiency reasons, or perhaps entities should have explicit
     access to immutable CARBON knowledge bases (initialised from a
     mutable KB and identified by hash thereafter). If published KB
     sections are frozen to the CAS somehow, then the very byte
     sequence stored in the CAS can be shipped direct to clients via
     MERCURY, including the potential bittorrent-style protocol
     extension, which would offer significant reduction in the layers
     of software required to service such requests, while still having
     a nice programming model. This could also be a basis for
     document-style entities to be able to provide past versions of
     themselves via the persona field. Voila, version control... How
     do we provide the semantics and workflow of a DVCS, though? An
     IODINE interface provided by document entities to pull and push
     changes from a clone entity?

  *  Put Disqus or something on this site so people can comment and
     ask questions.

  *  Create Docbook specifications

  *  Create an index of symbols defined in the specs under
................................................................................
     CARBON and TUNGSTEN writings to not assume this as a default
     namespace prefix. There is no default namespace prefix in those
     cases, so using a default-namespace name without declaring one is
     illegal.
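
The content-addressed storage idea in the TUNGSTEN item above (a knowledge base frozen from a mutable one and "identified by hash thereafter") can be sketched in a few lines of Python. This is only an illustration, not ARGON code: the `ContentAddressedStore` class, the `freeze_kb` helper, the choice of SHA-256, and the representation of assertions as plain strings are all assumptions made for the example.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory CAS: each blob is keyed by the SHA-256 of its bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Storing identical bytes twice is a no-op, so duplicates cost nothing.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        data = self._blobs[key]
        # Re-hash on read so a corrupted or tampered blob is detected.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("blob does not match its address")
        return data


def freeze_kb(store, assertions):
    """Serialise assertions in a canonical order and freeze them into the store.

    The newline-joined, sorted string form is a stand-in for whatever
    canonical encoding a real knowledge base would use.
    """
    canonical = "\n".join(sorted(assertions)).encode("utf-8")
    return store.put(canonical)
```

Because the key is derived from the bytes alone, the stored byte sequence could in principle be handed to a client unchanged and verified at the receiving end, which is the property the item above relies on for shipping frozen sections directly.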

<h2>Further research required</h2>

  *  Design a logging framework (PLATINUM). I want all logs integrated
     at one point, then distributed as needed, rather than ending up
     correlating disparate logs. I want logs structured with enough
     metadata to allow finding relationships between events so log
     entries can be "folded up" to hide fine detail (eg, a MERCURY
     incoming request will start off being logged by IRIDIUM then lead
     to a LITHIUM invocation which will lead to CHROME activity and
     then user code calling into WOLFRAM, TUNGSTEN and MERCURY to do
     stuff, then back to MERCURY to IRIDIUM to return a result, then
     WOLFRAM to commit the TUNGSTEN state transaction, etc - but it
     should be foldable up into a single "MERCURY invocation" summary
     event. We don't need CEP systems for this, just threading of
     event IDs down through the system as part of the dynamic
     execution context and appropriate importance levels. Log events
     should be represented in CARBON, so using these links shouldn't
     be hard. Also, node-specific events need to be fed into the
     HYDROGEN console system for diagnosis, and a storage system of
     some kind chosen for log events that need recording (with
     per-handler, per-entity, per-volume, per-node and cluster-wide
     log level settings to control storage utilisation, and the
     ability to remove levels of detail in older logs as they are
     expired). Where are logs stored?  Within entities, or as a
     special case with direct TUNGSTEN storage? What gets replicated
     when? Do we log events up to level N to the node only, and events
     up to some (lower) level M get replicated in real time because
     they're important and we need them to not be tamperable if a node
     is compromised, and the level N logs on the node get summarised
     to level X and then replicated every day for archival purposes?
     And who can see the logs? A way is needed for logs about a given
     entity to be made available to the administrators of that entity
     (see also the debug mode tracing mechanism in LITHIUM/ARGON,
     which need to boil down to this). This is important and major
     enough to need an element name, I feel!

  *  Should we allow for N-dimensional arrays as well as just vectors
     in IRON? They would allow for better predictive coding for sensor
     data such as images, as we'd be able to take into account pixels
     above as well as to the left of the next one (and the
     generalisation into higher dimensions thereof). If so, update the
     CARBON page's image storage example.
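
As a rough illustration of the N-dimensional array point above, the sketch below compares a predictor that only sees the pixel to the left (all a flat vector allows) with one that also uses the pixel above. The function names and the particular 2-D predictor (left + above - above-left, with missing neighbours treated as 0) are assumptions chosen for the example, not something IRON specifies.

```python
def residuals_1d(image):
    """Predict each pixel from its left neighbour only (vector view)."""
    out = []
    for row in image:
        prev = 0  # no left neighbour at the start of a row
        res = []
        for p in row:
            res.append(p - prev)
            prev = p
        out.append(res)
    return out


def residuals_2d(image):
    """Predict each pixel as left + above - above-left (needs 2-D structure)."""
    out = []
    for y, row in enumerate(image):
        res = []
        for x, p in enumerate(row):
            left = row[x - 1] if x > 0 else 0
            above = image[y - 1][x] if y > 0 else 0
            diag = image[y - 1][x - 1] if x > 0 and y > 0 else 0
            res.append(p - (left + above - diag))
        out.append(res)
    return out


def cost(residuals):
    """Sum of absolute residuals: a crude proxy for how well they compress."""
    return sum(abs(r) for row in residuals for r in row)
```

On a smooth gradient such as `[[10 * y + x for x in range(4)] for y in range(4)]`, the 2-D predictor leaves a smaller total residual than the left-only one, because it can cancel the vertical step at the start of each row.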