Minimalist Online Documentation
This is an area where I keep my personal notes while I'm working on a solution to a problem. These aren't in any order, and aren't necessarily consistent with any implementation of MOD, much less the current version.
File and directory order
OK, wanting to reorganize the tutorial has made me realize that not having any way to specify the order of files in a directory is a big PITA. I absolutely do not want to rely on something outside of the file itself (such as a file in the directory dictating the order, or tags in a document indicating what the next one should be) to determine this, since it's far too kludgy for my taste; no maintainability, and it goes against the philosophy of simplicity that's been carried out pretty well so far. I can see a couple possibilities:
In either case, the question remains of whether to mix files/directories. Does a directory "3/" come between files "2.mod" and "4.mod", or after them? Very good points for either case. All in all, nothing strikes me yet as a particularly good solution. So far, I'm still committed to the ideal that all of the information necessary to express the index layout can be intuitively represented by the filesystem itself. Here's a shot at a ruleset:
An alternate depiction:
    +-  <-- files that contain =me_first
    |   numbered files & directories
    +-
    |   uncategorized files
    +-  <-- files that contain =me_last
    |   uncategorized directories
    +-
Note that this is identical to the current behavior except for the insertion of the new "numbered files & directories" item. Perhaps a better depiction would be:
            +-
            | files that contain =me_first
            |                               <-- numbered files & directories
    files  <  uncategorized files
            |
            | files that contain =me_last
            +-
    dirs   <  uncategorized directories
            +-
This would be an awful lot cleaner if directories couldn't be numbered, since it introduces the inconsistency of moving directories into the previously exclusive file space, but being able to use them as containers in an ordered hierarchy seems just too attractive. Perhaps it should just be encouraged that either one technique (=me_first, =me_last) or the other (numbered files & dirs) be used, since using both can be confusing. An additional tag should probably be added that allows the number to be hidden in the index (=hide_number?).
Here's another thought: is it =me_first and =me_last that are introducing the confusion? Is there some way to do away with them? Doing away with =me_first would be easy, since adding a low number would do the same job. No such luck with =me_last.
One more problem: none of this addresses the very reasonable desire to want things in alphabetical order.
OK, new tack. How about a config file option? (default_sort = name|type|both ?)
Keep fleshing this out; there's some way of working this to allow me_first/last control for directories as well. The "type" sort method probably isn't necessary, since it's covered by "both". There's a better name than default_sort that will make me_first/last behaviour obvious, especially if they're changed to something like the tag "=this_file_first", and the file "=this_dir_first".
So, the so-called default_sort option really wouldn't control sort method at all (it'd all be alphanumeric), it would only control the "mixture" of files & dirs. What to call it? Files called =mingled_dirs & =unmingled_dirs?
First step is to implement sorted unmingled_dirs as the new default behavior, and to change me_first/last to this_file_first/last. From there we can look at adding control for mingled dirs and this_folder_first/last. The mingled_dirs option might need to be in the config file (rather than being able to control it at the directory level), so that we know how to sort nodes before we start the find(); otherwise we could be partway through a directory and suddenly discover that we're supposed to be sorting things in another way.
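As a rough illustration of the "first step" described above, here is a minimal Python sketch of a sort key (MOD itself appears to be Perl; Python is used only for illustration). The `Node` class, the `sort_nodes` function, and the `mingled_dirs` parameter are hypothetical names. Where tagged files land when directories are mingled is not settled in these notes, so the mingled branch is purely an assumption.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One index entry. All field names are hypothetical."""
    name: str
    is_dir: bool = False
    first: bool = False  # file carries a =this_file_first tag
    last: bool = False   # file carries a =this_file_last tag


def sort_nodes(nodes, mingled_dirs=False):
    """Return nodes in index order, comparing names as plain strings."""
    def key(node):
        # Band 0: tagged first, band 1: untagged, band 2: tagged last.
        if node.first:
            band = 0
        elif node.last:
            band = 2
        else:
            band = 1
        # Unmingled (default): every directory sorts after every file.
        # Mingled (ASSUMPTION): directories sort among the untagged files.
        dir_rank = 0 if mingled_dirs else int(node.is_dir)
        return (dir_rank, band, node.name)

    return sorted(nodes, key=key)
```

With files "2.mod" and "4.mod" and a directory "3", the default call places "3" after both files, while `mingled_dirs=True` places it between them. Names are compared as plain strings, so "10.mod" would sort before "2.mod"; a real implementation would need to decide how numbers are compared.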
Reexamination of naming
File tags
Template variables
Config file variables
File extensions
File and directory names
Directory links?
Should directories in the index be links? Only if there's an index.html? My initial reaction was that yes, they should be a link if there's an index.html file in that directory. After all, it would be a valid URL. I decided against it when I thought about what the "you are here" icon would do: you don't want it to point to the directory after you've clicked on it, since that wouldn't really be the file that it's displaying. Since it's going to point to some other file, this means that in terms of UI design, someone clicks on a target, and visually they are taken somewhere _other_ than where they clicked -- that arrow is going to jump down the list, potentially several items away if index.mod doesn't contain a =me_first tag. I didn't really find that disobedience palatable in terms of user interface design, and I couldn't come up with an alternate behavior for that arrow. So, directories are not links. Perhaps in future versions the index will be expandable and collapsible in a way that does a good job of clearly conveying the change in state. That would be an acceptable directory link behavior.
Source & dest file organization
There's a certain ugliness to the fact that the source tree is all human-maintained, but that the destination tree will probably contain a mixture of auto-generated html and human-maintained images, html and data files. This will require some more thought, as I'm not sure if this is solely a function of the webmaster's organization, or whether any design changes could be made to mod2html to facilitate good separation of manual and automatic content. In my mind, the cleanest organization would be:
    www
     +-images          $srcdir  = /www/src;
     +-data            $destdir = /www/html;
     +-src
     |  +-dir1
     |  +-dir2
     |  +-dir3
     +-html
        +-dir1
        +-dir2
        +-dir3
Or perhaps this modification (making sure none of the directories directly under src are called "images" or "data"), which gets rid of the redundant "html/" in the URLs:
    |
    +-src
    |  +-dir1          $srcdir  = /src;
    |  +-dir2          $destdir = /www;
    |  +-dir3
    +-www
       +-images
       +-data
       +-dir1
       +-dir2
       +-dir3
This means that 100% of the content in images, data and src is human-maintained, and 100% of the data in html is generated, so you could blow the html directory away at any time.
It's also possible to simply blend the two trees, where $srcdir == $destdir. This creates a cleaner presentation from the end viewer's perspective, and allows more flexibility in web site design, but it requires much more attention to maintain, since mistakes (such as accidentally overwriting a file, or making changes to an automatically generated file which are subsequently lost) are far more likely.
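The layout with $srcdir = /src and $destdir = /www, together with its rule about reserved directory names, could be sketched in Python as follows (MOD itself appears to be Perl; Python is used only for illustration). The function name `dest_path`, the `.html` output suffix, and the choice to raise an error on a collision are assumptions, not documented mod2html behavior.

```python
from pathlib import PurePosixPath

# Hand-maintained directories that live directly under $destdir.
RESERVED = {"images", "data"}


def dest_path(src_file, srcdir="/src", destdir="/www"):
    """Map a source file to the path of its generated file (hypothetical)."""
    rel = PurePosixPath(src_file).relative_to(srcdir)
    # A top-level source directory named like a reserved one would write
    # generated files into hand-maintained content, so refuse it.
    if len(rel.parts) > 1 and rel.parts[0] in RESERVED:
        raise ValueError(
            f"source directory {rel.parts[0]!r} collides with a "
            "hand-maintained directory under the destination"
        )
    # ASSUMPTION: every source file becomes an .html file of the same name.
    return str(PurePosixPath(destdir) / rel.with_suffix(".html"))
```

Under these assumptions, `dest_path("/src/dir1/page.mod")` yields "/www/dir1/page.html", while a source file under "/src/images/" is rejected before anything is written.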
Partial update strategy
Just some random notes I took while figuring out how to implement and use mod.status.
Not going to use md5sum, since rebuilding a file really isn't that expensive, but doing the sums could be. In addition, being able to touch a file to have it rebuilt is really the best interface, and doesn't have an equivalent with md5sum.
keep 2 hashes (old & new) that contain filename => mtime
need to recreate all dest files if any are true:
allowed to skip a source file if all are true:
we're allowed to delete a destination file if all are true:
don't update the status if you're just creating a tar file
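The two-hash idea (old and new, each mapping filename => mtime) can be sketched in Python as below. This shows only the raw comparison; it does not encode the recreate, skip, and delete conditions, which are not reproduced in these notes. The names `scan` and `diff_status` are hypothetical, and treating any mtime difference (not just a newer one) as a modification is an assumption.

```python
import os


def scan(srcdir, extensions=(".mod", ".txt")):
    """Build the 'new' hash: relative filename -> mtime."""
    table = {}
    for dirpath, _dirnames, filenames in os.walk(srcdir):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                table[os.path.relpath(path, srcdir)] = os.path.getmtime(path)
    return table


def diff_status(old, new):
    """Compare two filename -> mtime hashes.

    Returns (created, modified, removed) as sorted lists of filenames.
    """
    created = sorted(f for f in new if f not in old)
    # ASSUMPTION: any change in mtime counts, so touching a file rebuilds it.
    modified = sorted(f for f in new if f in old and new[f] != old[f])
    removed = sorted(f for f in old if f not in new)
    return created, modified, removed
```

A caller would load the old hash from the status data, build the new one with `scan`, and feed the three lists into whatever rebuild and delete rules apply.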
Update to partial updates
The desire to maintain multiple configurations for the same source illustrated that a single mod.status file was going to be insufficient for partial updates. Here are some additional notes on partial updates that detail the status file's redesign and reimplementation.
mod.status definitely needs to keep the names of the mod files themselves, since we need to know the difference between a modified file and a newly created file. Timestamp is less important, since the timestamp of the status file itself could represent the time of last update. The problem arises when we have one source with multiple config files; each destination should have its own set of status info to be done properly. Destinations that appear on the command line don't use any status info, so they're not a problem. The status info kept for each dest should include template and config file info as well, since these could change name or mtime, and the tree should be subsequently recreated.
So, should all status info continue to be maintained in a single file with multiple sections for each destination (unlikely, too cumbersome to program), or should we have separate status files for each dest? The problem with the latter is:
Could we detect "stale" status files? A stale status file occurs when a config file changes its destination. If status files are "keyed" on destination, we wouldn't have any way of knowing what the previous destination was. This is very unfortunate, since I really don't like the idea of garbage left behind.
In summary: One status file per destination. It contains filenames only, of all the source files (mod and txt), the config file, and the templates. Its mtime represents the last update time for that destination. Here are some possible scenarios for a given source tree and how we'll handle them:
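The summary above (filenames only, with the status file's own mtime standing in for the last update time) might look like this in Python. The one-name-per-line file format and the names `classify` and `write_status` are assumptions made for illustration; they are not taken from MOD.

```python
import os


def write_status(status_path, tracked_files):
    """Record the tracked filenames; the file's mtime marks the update time."""
    with open(status_path, "w") as fh:
        for name in sorted(tracked_files):
            fh.write(name + "\n")


def classify(status_path, tracked_files):
    """Sort tracked files (sources, config, templates) into three groups."""
    if not os.path.exists(status_path):
        # No status for this destination yet: everything counts as new.
        return {"new": sorted(tracked_files), "modified": [], "missing": []}
    last_update = os.path.getmtime(status_path)
    with open(status_path) as fh:
        known = {line.rstrip("\n") for line in fh if line.strip()}
    return {
        # Not listed in the status file: newly created.
        "new": sorted(f for f in tracked_files if f not in known),
        # Listed, but touched since the last update: modified.
        "modified": sorted(
            f for f in tracked_files
            if f in known and os.path.getmtime(f) > last_update
        ),
        # Listed, but no longer present in the tracked set.
        "missing": sorted(known - set(tracked_files)),
    }
```

Keeping names in the file is what lets a newly created file be told apart from a modified one, which a bare timestamp could not do.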
Random thoughts
Should verbose warn about files missing =description or =meta tags? Maybe this is an example for a "-pedantic" command line option?
Control file reorganization
The introduction of support for multiple config files introduced multiple status files and probably multiple templates as well. Unfortunately, this is really starting to clutter the root of the srcdir. I'm currently thinking about moving everything into a $srcdir/.mod directory:
    $srcdir/
     |
     +-.mod/
     |  |
     |  +-config
     |  +-default.tmpl
     |  +-status/
     |     |
     |     +-mod.status.49140baa019934cf8c961b2ce886ae38
     |     +-mod.status.34d9583cd49c584ef30958340b580945
     +-index.mod
     +-cray/
     |  |
     |  +- index.mod
     ...
This would keep the srcdir nice and clean, and still hide the unfriendly status files. The default config file location would go from "mod.conf" to ".mod/config" (in fact, there's no reason why it couldn't look for both, and just prefer ".mod/config"). Multiple config files and templates could be kept in the .mod directory as well. In fact, there would be nothing forcing anyone to use this setup (except for the status files) since the config file can be specified on the command line, and the template can be specified in the config file.
To simplify these changes in the future, I should probably have a hash devoted to pathnames.
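A "hash devoted to pathnames" for this layout could be sketched in Python as below. The function name `build_paths` and the dictionary keys are hypothetical. The notes do not say how the 32-hex-digit suffix of each status file is derived; using an MD5 digest of the destination path is purely an assumption that happens to produce a name of the same shape.

```python
import hashlib
import os


def build_paths(srcdir, destdir):
    """Collect every control-file location in one place (hypothetical)."""
    control = os.path.join(srcdir, ".mod")
    preferred = os.path.join(control, "config")
    legacy = os.path.join(srcdir, "mod.conf")

    # Look for both config locations and prefer ".mod/config".
    if os.path.exists(preferred) or not os.path.exists(legacy):
        config = preferred
    else:
        config = legacy

    # ASSUMPTION: the status file is keyed on an MD5 of the destination path.
    digest = hashlib.md5(destdir.encode("utf-8")).hexdigest()

    return {
        "control": control,
        "config": config,
        "template": os.path.join(control, "default.tmpl"),
        "status": os.path.join(control, "status", "mod.status." + digest),
    }
```

Because every location is computed here, moving the control files again later would mean changing this one function rather than hunting for path strings elsewhere.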
Templates
On the subject of reorganization, I think it could be really helpful to have more flexibility for specifying templates. How about the following set of rules for determining which template to apply to a given file "filename.mod" or "filename.txt":
It would probably be easiest to apply these rules during the find() operation, and store each node's template in the node attributes. This way the modifications to second_pass() should be relatively minor. Given a complete node, a subroutine shouldn't have too much trouble figuring out what its template should be.
Guidance
I read the following quote while reading Yvon Chouinard's essay The Next Hundred Years (which is an incredibly worthwhile read, go, read it now!). I thought it represented my goals for this project as well.
On the same note, I recently came across Almost Free Text, which seems to
have similar goals to MOD, but only for one document at a time. For that
purpose, it accomplishes its goals far better than MOD does (no tags at all!).
Good role model.