Documentation Source Text

Artifact Content
Login

Artifact b6d51e9e9160762fc009e266382fc0613b5f372beadcbf5b5f2555de5dea4094:


<title>Why SQLite Does Not Use Git</title>

<table_of_contents>

<h1>Introduction</h1>

<p>
SQLite does not use the
[https://git-scm.org|Git] version control system.
SQLite uses
[https://fossil-scm.org/|Fossil] instead, which is a
version control system that was specifically designed
and written to support SQLite.

<p>
People often wonder why SQLite does not use the
[https://git-scm.org|Git] version control system like everybody
else.
This article attempts to answer that question.  Also,
in <a href="#getthecode">section 3</a>, 
this article provides hints to Git users
about how they can easily access the SQLite source code.

<p>
This article is <u>not</u> a comparison between Fossil
and Git.  See
[https://fossil-scm.org/fossil/doc/trunk/www/fossil-v-git.wiki]
for one comparison of the two systems.  There are others as
well.

<p>
This article is <u>not</u> advocating that you switch your projects
away from Git.  You can use whatever version control system you want.
If you are perfectly happy with Git, then by all means keep using
Git.  But, if you are wondering if there isn't something better,
then maybe try to understand the perspectives presented below.
Use the insights thus obtained to find or write a different and
better version control system, or to just make
improvements to Git itself.

<h2>Edits</h2>

<p>
This article has been revised multiple times in an attempt
to improve clarity, address concerns and misgivings,
and to fix errors identified on
[https://news.ycombinator.com/item?id=16806114|Hacker News],
[https://www.reddit.com/r/programming/comments/8c2niw/why_sqlite_does_not_use_git/|Reddit]
and
[https://lobste.rs/s/slcntl/why_sqlite_does_not_use_git|Lobsters].
The complete edit history can be seen at
[https://sqlite.org/docsrc/finfo/pages/whynotgit.in].
(Usage hint: Click on any two nodes of the graph for a diff.)

<h1>A Few Reasons Why SQLite Does Not Use Git</h1>

<h2>Git does not provide good situational awareness</h2>

<p>
When I want to see what has been happening on SQLite (or any of
about a dozen other projects that I work on) I visit the
[https://sqlite.org/src/timeline|timeline] and in a single
screen I can see a quick summary of all the latest changes,
on all branches.
In a few clicks, I can drill down to see as much detail as I
want.  I can even do this from a phone.

<p>
GitHub and GitLab offer nothing comparible.  The closest I have
found is the [https://github.com/sqlite/sqlite/network|network],
which is slow to render (unless it is already cached), does not 
offer nearly as much details, and scarcely works on mobile.
The [https://github.com/sqlite/sqlite/commits/master|commits] view
of GitHub provides more detail, renders quickly,
and works on mobile, but only shows a single branch at a time,
so I cannot easily know if I've seen all of the recent changes.
And even if GitHub/GitLab did offer better interfaces, both are
third-party services.  They are not a core part of Git.  Hence,
using them introduces yet another dependency into the project.

<p>
I am told that Git users commonly install third-party graphical
viewers for Git, many of which do a better job of showing recent 
activity on the project.  That is great, but these are still
more third-party applications that must be installed and
managed separately.  Many are platform-specific.  (One of the
better ones, [https://gitup.co/|GitUp], only works on Mac, for
example.)  All require that you first sync your local repository
then bring up their graphical interface on your desktop.  And
even with all that, I still cannot see what I typically want to 
see without multiple clicks.  Checking on project status from
a phone while away from the office is not an option.

<h2>Git makes it difficult to find successors (descendents)
of a check-in</h2>

<p>
Git allows you to go backwards in time easily.  Given the latest
check-in on a branch, Git lets you see all the ancestors of that
check-in.  But Git makes it difficult to move in the other
direction.  Given some historical check-in, it is quite challenging
in Git to find out what came next.  It can be done, but it is sufficiently
difficult that people rarely do it.  Common interfaces for Git,
such as GitHub, do not support the ability.

<p>It is not impossible to find the descendents of a check-in
in Git.  It is merely difficult.  For example,
there is a 
[https://stackoverflow.com/questions/27960605/find-all-the-direct-descendants-of-a-given-commit#27962018|stackoverflow page]
showing the command sequence for finding the descendents of a check-in
in unix:

<codeblock>
git rev-list --all --parents | grep "^.\{40\}.*<PARENT_SHA1>.*" | awk '{print $1}'
</codeblock>

<p>
This command sequence is a lot to memorize and type.  (One would want
to create a bash alias or short shell script if it were used frequently.)
Furthermore, it is not quite the same thing.  The command above gives
one a list of descendents without showing the branching structure, which
is important for understanding what happened.
In contrast, Fossil offers displays such as
[https://sqlite.org/src/timeline?d=8a439a6dda390d74&n=15], which
is a tremendous help in analyzing the aftermath of historical changes.

<p>
And it is not really about just finding the descendents of a check-in
from time to time.  The fact that descendents are readily available in
Fossil means that the information pervades the web pages provided by
Fossil.  One example: Every Fossil check-in information page
([https://www.sqlite.org/src/info/ec7addc87f97bcff|example]) shows
a small "Context" graph of the immediate predecessor and successors 
to that check-in.  This helps the user maintain better situational
awareness, and it provides useful capabilities, such as the ability
click forward to the next check-in in sequence.  Another example:
Fossil easily shows the context around a specific check-in
([https://www.sqlite.org/src/timeline?c=2018-03-16&n=10|example])
which again helps to promote situational awareness and a deeper
understanding of what is happening in the code.

<p>
All of the above is possible with Git, given the right extensions
and tools and using the right commands.  But it is not easy to do,
and so it rarely gets done.  Consequently, developers have less awareness
of what is happening in the code.

<h2>The mental model for Git is needlessly complex</h2>

<p>
The complexity of Git
distracts attention from the software under development.  A user of Git
needs to keep all of the following in mind:
<ol type='a'>
<li> The working directory
<li> The "index" or staging area
<li> The local head
<li> The local copy of the remote head
<li> The actual remote head
</ol>
<p>
Git commands (and/or options on commands) for moving and
comparing content between all of these locations. 

<p>In contrast,
Fossil users only need to think about their working directory and
the check-in they are working on.  That is 60% less distraction.
Every developer has a finite number of brain-cycles.  Fossil
requires fewer brain-cycles to operate, thus freeing up 
intellectual resources to focus on the software under development.

<p>One user of both Git and Fossil
[https://news.ycombinator.com/item?id=16806955|writes in HN]:

<blockquote><i>
Fossil gives me peace of mind that I have everything ... synced to 
the server with a single command....
I never get this peace of mind with git.
</i></blockquote>

<h2>Git does not track historical branch names</h2>

<p>
Git keeps the complete DAG of the check-in sequence.  But branch
tags are local information that is not synced and not retained
once a branch closes.
This makes review of historical
branches tedious.

<p>
As an example, suppose someone (perhaps a customer) asks you:
"What ever became of that 'prefer-coroutine-sort-subquery' branch
from two years ago?"
You might try to answer the query by consulting the history in
your version control system, thusly:

<ul>
<li><b>GitHub:</b> [https://github.com/sqlite/sqlite/commits/prefer-coroutine-sort-subquery]
<li><b>Fossil:</b> [https://sqlite.org/src/timeline?r=prefer-coroutine-sort-subquery]
</ul>

<p>
The Fossil view clearly shows that the branch was eventually merged back into
trunk.  It shows where the branch started, and it shows two occasions where changes
on trunk were merged into the branch.  GitHub shows none of this.  In fact, the
GitHub display is mostly useless in trying to figure out what happened.

<p>
Many readers have recommended various third-party GUIs for Git that
might do a better job of showing historical development activity.  Maybe
some of them do work better than native Git and/or GitHub, though they
will all be hampered by the fact that Git does not preserve historical
branch names across syncs.  And even if those other tools are better,
the fact that it is necessary to go to a third-party tool to get the information
desired does not speak well of the core system.

<h2>Git requires more administrative support</h2>

<p>
Git is complex software.
One needs an installer of some kind to put Git on a developer
workstation, or to upgrade to a newer version of Git.
Setting up a Git server is non-trivial, and so most users
have to use a third-party service such as GitHub or GitLab,
and thus introduce additional (unnecessary) dependencies
into the project.

<p>
In contrast, Fossil is a single standalone binary which is
installed by putting it on $PATH.  That one binary contains all
the functionality of core Git and also GitHub and/or GitLab.  It
manages a community server with wiki, bug tracking, and forums, 
provides packaged downloads for consumers, login managements, 
and so forth, with no extra software required.

<p>
Less administration means that programmers spend more time working
on the software (SQLite in this case) and less time fussing with
the version control system.

<h2>Git provides a poor user experience</h2>

<p>The following [https://xkcd.com/1597/] cartoon is an
exaggeration, yet hits close to home:

<p>
<img src="https://www.fossil-scm.org/fossil/doc/trunk/www/xkcd-git.gif">

<p>Let's be real.  Few people seriously dispute that Git provides
a suboptimal user experience.  A lot of 
the underlying implementation shows through into the user
interface.  The interface is so bad that there is even a
parody site that generates
[https://git-man-page-generator.lokaltog.net/|fake git man pages].

<p>Designing software is hard.  It takes a lot of focus.
A good version control system should provide the developer with
assistance, not frustration.  Git has gotten better in this
regard over the past decade, but it still has a long way to go.

<a name="getthecode"></a>
<h1>A Git-User's Guide To Accessing SQLite Source Code</h1>

<p>
If you are a devoted Git user, you can still easily access SQLite.  
This section gives some hints on how to do so.

<h2>The Official GitHub Mirror</h2>

<p>
As of 2019-03-20, there is now an 
[https://github.com/sqlite/sqlite|official Git mirror] of the
SQLite sources on GitHub.

<p>The mirror is an incremental export of the 
[https://sqlite.org/src/timeline|canonical Fossil repository] for
SQLite.  A cron-job updates the GitHub repository once an hour.
This is a one-way, read-only code mirror.  No pull requests or 
changes are accepted via GitHub.  The GitHub repository merely copies
the content from the Fossil repository.  All changes are input via
Fossil.

<p>
The hashes that identify check-ins and files on the Git mirror are
different from the hashes in Fossil.  There are many reasons for
this, chief among them that Fossil uses a SHA3-256 hash whereas
Git uses a SHA1 hash.  During export, the original Fossil hash for
each check-in is added as a footer to check-in comments.  To avoid
confusion, always use the original Fossil hash, not the Git hash,
when referring to SQLite check-ins.

<h2>Web Access</h2>

<p>
The [https://sqlite.org/src/timeline|SQLite Fossil Repository] contains links
for downloading  a Tarball, ZIP Archive, or [SQLite Archive] for any
historical version of SQLite.  The URLs for these downloads are
simple and can be incorporated easily into automated tools.  The format is:

<blockquote>
<tt>https://sqlite.org/src/tarball/</tt><i>VERSION</i><tt>/sqlite.tar.gz</tt>
</blockquote>

<p>
Simply replace <i>VERSION</i> with some description of the version to be
downloaded.  The <i>VERSION</i> can be a prefix of the cryptographic hash
name of a specific check-in, or the name of a branch (in which case the
most recent version of the branch is fetched) or a tag for a specific
check-in like "version-3.23.1":

<blockquote>
<tt>https://sqlite.org/src/tarball/version-3.23.1/sqlite.tar.gz</tt>
</blockquote>


<p>To get the latest release, use "release"
for <i>VERSION</i>, like this:

<blockquote>
<tt>https://sqlite.org/src/tarball/release/sqlite.tar.gz</tt>
</blockquote>

<p>
To get the latest trunk check-in, us "trunk" for <i>VERSION</i>:

<blockquote>
<tt>https://sqlite.org/src/tarball/trunk/sqlite.tar.gz</tt>
</blockquote>

<p>
And so forth.
For ZIP archives and SQLite Archives, simply change the "/tarball/" element
into either "/zip/" or "/sqlar/", and maybe also change the name of the
download file to have a ".zip" or ".sqlar" suffix.

<h2>Fossil Access</h2>

<p>
Fossil is easy to install and use.  Here are the steps for unix.
(Windows is similar.)

<ol>
<li>
Download the self-contained Fossil executable from
[https://fossil-scm.org/fossil/uv/download.html] and put the executable
somewhere on your $PATH.
<li><tt>mkdir ~/fossils</tt>
<li><tt>fossil clone https://sqlite.org/src ~/fossils/sqlite.fossil</tt>
<li><tt>mkdir ~/sqlite; cd ~/sqlite</tt>
<li><tt>fossil open ~/fossils/sqlite.fossil</tt>
</ol>

<p>
At this point you are ready to type "<tt>./configure; make</tt>"
(or on Windows with MSVC, "<tt>nmake /f Makefile.msc</tt>").

<p>
To change your checkout to a different version of Fossil use
the "update" command:

<blockquote>
<tt>fossil update </tt><i>VERSION</i>
</blockquote>

<p>
Use "trunk" for <i>VERSION</i> to get the latest trunk version of SQLite.
Or use a prefix of a cryptographic hash name, or the name of some branch
or tag.  See
[https://fossil-scm.org/fossil/doc/trunk/www/checkin_names.wiki] for more
suggestions on what names can be used for <i>VERSION</i>.

<p>
Use the "<tt>fossil ui</tt>" command from within the ~/sqlite checkout to
bring up a local copy of the website.

<p>
Additional documentation on Fossil can be found at
[https://fossil-scm.org/fossil/doc/trunk/www/permutedindex.html]

<p>
Do not be afraid to explore and experiment.
Without a log-in you won't be able to
push back any changes you make, so you cannot damage the project.

<h2>Verifying Source Code Integrity</h2>

<p>
If you need to verify that the SQLite source code that you have is
authentic and has not been modified in any way (perhaps by an adversary)
that can be done using a few simple command-line tools.  At the root 
of the SQLite source tree is a file named "manifest".  The manifest 
file contains the name of every other file in the source tree together 
with either a SHA1 or SHA3-256 hash for that file.  (SHA1 is used for
older files and SHA3-256 for newer files.)  You can write a
script to extract these hashes and verify them against the source code 
files.  The hash name for the check-in is just the SHA3-256 hash of the
"manifest" file itself.

<h1>See Also</h1>

<p>Other pages that talk about Fossil and Git include:
<ul>
<li><p>[https://fossil-scm.org/fossil/doc/trunk/www/fossil-v-git.wiki|Fossil vs. Git]
<li><p>[https://www.fossil-scm.org/fossil/doc/trunk/www/quotes.wiki|What others say about Fossil and Git]
</ul>