
Git Integration

The git module provides:

  • Auto-commit + deferred push for write operations (via on_write)
  • Periodic pull primitives (fast-forward first, with a rebase fallback when branches have diverged) used by the server to keep the working tree up to date

Quick Start

from pathlib import Path
from markdown_vault_mcp import Collection, GitWriteStrategy

strategy = GitWriteStrategy(
    token="ghp_your_token",
    push_delay_s=30,
)

collection = Collection(
    source_dir=Path("/path/to/vault"),
    read_only=False,
    on_write=strategy,
)

# Writes are now auto-committed; the push fires after 30 s without further writes
collection.write("notes/new.md", "Hello world")

# Clean up on shutdown
collection.close()

API Reference

GitWriteStrategy(token=None, username='x-access-token', repo_url=None, managed=False, enable_pull=True, enable_push=True, push_delay_s=30.0, commit_name=None, commit_email=None, git_lfs=True, repo_path=None)

Stateful git strategy: commit per write, deferred push.

On each callback invocation:

  1. Stages the changed file (git add or git add -u for deletes).
  2. Commits with an auto-generated message ("operation: path").
  3. Resets the push timer — push fires after push_delay_s of idle.

Push is deferred to a background threading.Timer that resets on each write. When the timer fires (no writes for push_delay_s), all accumulated local commits are pushed in a single git push.
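
The reset-on-write timer described above is a debounce. The following is a minimal, standalone sketch of that pattern, not the library's code: the class name DebouncedAction and its methods are hypothetical, and the action stands in for the single git push.

```python
from __future__ import annotations

import threading
from typing import Callable


class DebouncedAction:
    """Run `action` once after `delay_s` seconds with no new triggers."""

    def __init__(self, action: Callable[[], None], delay_s: float) -> None:
        self._action = action
        self._delay_s = delay_s
        self._lock = threading.Lock()
        self._timer: threading.Timer | None = None
        self._pending = False

    def trigger(self) -> None:
        """Record pending work and restart the idle countdown."""
        with self._lock:
            self._pending = True
            if self._timer is not None:
                self._timer.cancel()  # drop the previous countdown
                self._timer = None
            if self._delay_s > 0:  # a delay of 0 means: wait for flush()
                self._timer = threading.Timer(self._delay_s, self._fire)
                self._timer.daemon = True
                self._timer.start()

    def _fire(self) -> None:
        with self._lock:
            if not self._pending:
                return
            self._pending = False
            self._timer = None
        self._action()  # run outside the lock

    def flush(self) -> None:
        """Cancel the countdown and run the action now if work is pending."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
            pending = self._pending
            self._pending = False
        if pending:
            self._action()
```

Calling trigger() several times in quick succession results in one call to the action once the idle period has passed, which is why a burst of writes produces a single push.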

On startup, any unpushed local commits (from a previous crash) are pushed immediately.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| token | str \| None | PAT for HTTPS push via GIT_ASKPASS. None uses SSH or pre-configured credentials. | None |
| username | str | Username used with token auth. Defaults to "x-access-token" (GitHub-compatible). | 'x-access-token' |
| repo_url | str \| None | Remote URL expected in managed mode. | None |
| managed | bool | When True, ensure the repo exists under repo_path: clone into an empty directory or validate origin on existing repos. | False |
| enable_pull | bool | Enable fetch + ff-only sync methods. | True |
| enable_push | bool | Enable deferred push behavior. | True |
| push_delay_s | float | Seconds of idle before pushing. 0 disables the timer (push only on close()). | 30.0 |
| commit_name | str \| None | Git committer name; defaults to DEFAULT_COMMIT_NAME. | None |
| commit_email | str \| None | Git committer email; defaults to DEFAULT_COMMIT_EMAIL. | None |
| git_lfs | bool | When True (default), run git lfs pull during lazy initialisation so LFS pointers are resolved before the first write is committed. Requires git-lfs to be on PATH; failures are logged at ERROR and never propagated. | True |
| repo_path | Path \| None | Optional repository path used for startup validation. When set together with token, startup raises markdown_vault_mcp.exceptions.ConfigurationError if origin uses SSH transport instead of HTTPS. | None |
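
The token parameter is described as being forwarded through GIT_ASKPASS. The following is a minimal sketch of how that mechanism works in general, assuming a POSIX shell; it is not the library's implementation. The helper names make_askpass_env and cleanup_askpass_env and the DEMO_GIT_* variables are hypothetical.

```python
from __future__ import annotations

import os
import stat
import tempfile

# git runs the GIT_ASKPASS program with the prompt text as its first
# argument and reads the answer from stdout. The secret stays in the
# environment, so the script file itself contains no credentials.
_ASKPASS_SCRIPT = """#!/bin/sh
case "$1" in
    Username*) printf '%s\\n' "$DEMO_GIT_USERNAME" ;;
    *)         printf '%s\\n' "$DEMO_GIT_TOKEN" ;;
esac
"""


def make_askpass_env(token: str, username: str = "x-access-token") -> dict[str, str]:
    """Return a copy of os.environ that answers git's credential prompts."""
    fd, script_path = tempfile.mkstemp(prefix="askpass-", suffix=".sh")
    with os.fdopen(fd, "w") as fh:
        fh.write(_ASKPASS_SCRIPT)
    os.chmod(script_path, stat.S_IRWXU)  # owner-only: rwx------

    env = dict(os.environ)
    env["GIT_ASKPASS"] = script_path
    env["GIT_TERMINAL_PROMPT"] = "0"  # fail instead of prompting on a TTY
    env["DEMO_GIT_USERNAME"] = username  # hypothetical variable name
    env["DEMO_GIT_TOKEN"] = token  # hypothetical variable name
    return env


def cleanup_askpass_env(env: dict[str, str] | None) -> None:
    """Delete the temporary helper script, if one was created."""
    if env and "GIT_ASKPASS" in env:
        try:
            os.unlink(env["GIT_ASKPASS"])
        except FileNotFoundError:
            pass
```

The returned mapping would be passed as env= to subprocess.run for git fetch or git push, with the cleanup call placed in a finally block.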

Example:

strategy = GitWriteStrategy(token="ghp_...", push_delay_s=30)
collection = Collection(on_write=strategy, ...)
# ... writes happen, push deferred ...
strategy.close()  # final flush

Source code in src/markdown_vault_mcp/git.py
def __init__(
    self,
    token: str | None = None,
    username: str = "x-access-token",
    repo_url: str | None = None,
    managed: bool = False,
    enable_pull: bool = True,
    enable_push: bool = True,
    push_delay_s: float = 30.0,
    commit_name: str | None = None,
    commit_email: str | None = None,
    git_lfs: bool = True,
    repo_path: Path | None = None,
) -> None:
    # Token is retained for GIT_ASKPASS credential forwarding in subprocesses.
    # This pattern is intentionally accepted and suppressed in CodeQL config.
    self._token = token
    self._username = username
    self._repo_url = repo_url
    self._managed = managed
    self._enable_pull = enable_pull
    self._enable_push = enable_push
    self._push_delay_s = push_delay_s
    self._commit_name = commit_name or self.DEFAULT_COMMIT_NAME
    self._commit_email = commit_email or self.DEFAULT_COMMIT_EMAIL
    self._git_lfs = git_lfs
    self._git_root: Path | None = None
    self._git_root_checked = False
    self._write_init_done = False
    self._push_pending = False
    self._timer: threading.Timer | None = None
    self._lock = threading.Lock()
    self._closed = False
    self._pull_stop = threading.Event()
    self._pull_thread: threading.Thread | None = None
    self._pull_interval_s: int = 0
    self._pull_repo_path: Path | None = None
    self._pause_writes: (
        Callable[[], contextlib.AbstractContextManager[None]] | None
    ) = None
    self._on_pull: Callable[[], None] | None = None
    if repo_path is not None:
        if self._managed:
            self._ensure_managed_repo(repo_path)
        else:
            self.validate_startup(repo_path)

__call__(path, content, operation)

WriteCallback interface: stage + commit, then schedule push.
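
Any callable with this (path, content, operation) shape can serve as a write callback. As a rough illustration, the sketch below defines a stand-in that records "operation: path" messages instead of running git. The names WriteCallbackLike and RecordingCallback are hypothetical and only mirror the signature shown on this page.

```python
from __future__ import annotations

from pathlib import Path
from typing import Literal, Protocol

Operation = Literal["write", "edit", "delete", "rename"]


class WriteCallbackLike(Protocol):
    """Hypothetical stand-in for the library's WriteCallback type."""

    def __call__(self, path: Path, content: str, operation: Operation) -> None: ...


class RecordingCallback:
    """Collects "operation: path" messages instead of committing."""

    def __init__(self) -> None:
        self.messages: list[str] = []
        self.closed = False

    def __call__(self, path: Path, content: str, operation: Operation) -> None:
        if self.closed:  # mirror the early return once closed
            return
        self.messages.append(f"{operation}: {path.as_posix()}")

    def close(self) -> None:
        self.closed = True
```

A stand-in like this can be handy in tests that want to observe write events without touching a repository.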

Source code in src/markdown_vault_mcp/git.py
def __call__(
    self,
    path: Path,
    content: str,  # noqa: ARG002
    operation: Literal["write", "edit", "delete", "rename"],
) -> None:
    """WriteCallback interface: stage + commit, then schedule push."""
    if self._closed:
        return

    self._ensure_git_root(path)
    if self._git_root is None:
        logger.debug(
            "No git repository found for %s; git operations disabled", path
        )
        return

    self._ensure_write_init()

    if self._git_root is None:
        return

    try:
        with self._lock:
            _stage_and_commit(
                self._git_root,
                path,
                operation,
                commit_name=self._commit_name,
                commit_email=self._commit_email,
            )
        if self._enable_push:
            self._schedule_push()
    except subprocess.CalledProcessError as exc:
        sanitized_stderr = exc.stderr or ""
        if self._token and self._token in sanitized_stderr:
            sanitized_stderr = sanitized_stderr.replace(self._token, "***")
        logger.error(
            "Git operation failed for %s (%s): command %s returned %d\n%s",
            path,
            operation,
            exc.cmd,
            exc.returncode,
            sanitized_stderr,
        )
    except Exception:
        logger.error(
            "Git operation failed for %s (%s)",
            path,
            operation,
            exc_info=True,
        )
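
The error handler above masks the token before logging stderr. The same idea as a standalone helper, as a small sketch (the function name redact_token is hypothetical):

```python
from __future__ import annotations


def redact_token(text: str | None, token: str | None) -> str:
    """Return `text` with every occurrence of `token` replaced by "***"."""
    cleaned = text or ""
    if token and token in cleaned:
        cleaned = cleaned.replace(token, "***")
    return cleaned
```

Note that this only masks the literal token; an encoded form of it (for example URL-encoded) would pass through unchanged.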

sync_once(repo_path)

Fetch and update once, returning True if HEAD advanced.

Tries fast-forward first; falls back to rebase when the local and upstream branches have diverged (e.g. Obsidian and MCP both committed on different files). Aborts on true conflicts.

Source code in src/markdown_vault_mcp/git.py
def sync_once(self, repo_path: Path) -> bool:
    """Fetch and update once, returning True if HEAD advanced.

    Tries fast-forward first; falls back to rebase when the local
    and upstream branches have diverged (e.g. Obsidian and MCP both
    committed on different files).  Aborts on true conflicts.
    """
    if self._closed or not self._enable_pull:
        return False

    git_root = self._ensure_git_root(repo_path)
    if git_root is None:
        return False

    env = None
    try:
        env = self._git_env()
        with self._lock:
            upstream_check = subprocess.run(
                [
                    "git",
                    "-C",
                    str(git_root),
                    "rev-parse",
                    "--verify",
                    "@{upstream}",
                ],
                capture_output=True,
                text=True,
                env=env,
            )
            if upstream_check.returncode != 0:
                logger.info("Git pull: no upstream configured; skipping fetch")
                return False

            old_head = subprocess.run(
                ["git", "-C", str(git_root), "rev-parse", "HEAD"],
                capture_output=True,
                text=True,
                check=True,
                env=env,
            ).stdout.strip()

            subprocess.run(
                ["git", "-C", str(git_root), "fetch"],
                capture_output=True,
                text=True,
                check=True,
                env=env,
            )

            try:
                subprocess.run(
                    [
                        "git",
                        "-C",
                        str(git_root),
                        "merge",
                        "--ff-only",
                        "@{upstream}",
                    ],
                    capture_output=True,
                    text=True,
                    check=True,
                    env=env,
                )
            except subprocess.CalledProcessError as ff_exc:
                # ff-only failed — the branches have diverged.  Attempt
                # rebase to replay local MCP commits on top of upstream.
                # This handles the common case where Obsidian and the MCP
                # server both committed independently on different files.
                logger.debug(
                    "Git pull: ff-only failed, attempting rebase: %s",
                    (ff_exc.stderr or "").strip(),
                )
                try:
                    subprocess.run(
                        [
                            "git",
                            "-C",
                            str(git_root),
                            "rebase",
                            "@{upstream}",
                        ],
                        capture_output=True,
                        text=True,
                        check=True,
                        env=env,
                    )
                    logger.info(
                        "Git pull: ff-only not possible, rebased local commits onto upstream"
                    )
                except subprocess.CalledProcessError:
                    # True conflict — resolve by accepting theirs and
                    # saving the MCP version as a conflict file.
                    saved = self._resolve_rebase_conflicts(git_root, env)

                    # Check if a rebase is still in progress (e.g. the
                    # loop exited via break because no conflicting files
                    # were found but rebase --continue had returned
                    # non-zero, or the iteration limit was hit).
                    rebase_head = subprocess.run(
                        [
                            "git",
                            "-C",
                            str(git_root),
                            "rev-parse",
                            "--verify",
                            "REBASE_HEAD",
                        ],
                        capture_output=True,
                        text=True,
                        env=env,
                    )
                    rebase_in_progress = rebase_head.returncode == 0

                    if rebase_in_progress:
                        # Abort the incomplete rebase before committing
                        # conflict files so the working tree is clean.
                        abort_proc = subprocess.run(
                            ["git", "-C", str(git_root), "rebase", "--abort"],
                            capture_output=True,
                            text=True,
                            env=env,
                        )
                        if abort_proc.returncode != 0:
                            logger.error(
                                "Git pull: failed to abort rebase: %s",
                                (abort_proc.stderr or "").strip(),
                            )
                        # After abort, the working tree reverts to the
                        # pre-rebase state (MCP commits), so the original
                        # files contain MCP content, not upstream content.
                        # Restore the upstream version for each conflicting
                        # file so _write_conflict_files reads the right side.
                        for rel_path, _ in saved:
                            subprocess.run(
                                [
                                    "git",
                                    "-C",
                                    str(git_root),
                                    "checkout",
                                    "@{upstream}",
                                    "--",
                                    rel_path,
                                ],
                                capture_output=True,
                                text=True,
                                env=env,
                            )

                    if saved:
                        written = self._write_conflict_files(git_root, saved, env)
                        for cf in written:
                            logger.warning(
                                "Git pull: conflict resolved, saved MCP version as %s",
                                cf,
                            )
                        logger.info(
                            "Git pull: rebase completed with %d conflict file(s)",
                            len(written),
                        )
                    else:
                        # Resolution failed entirely — stay put.
                        logger.warning(
                            "Git pull: conflict resolution failed, skipping"
                        )
                        return False

            new_head = subprocess.run(
                ["git", "-C", str(git_root), "rev-parse", "HEAD"],
                capture_output=True,
                text=True,
                check=True,
                env=env,
            ).stdout.strip()

            # Always attempt LFS pull after a successful fetch+ff-only step.
            self._lfs_pull(env=env)

        return old_head != new_head
    except FileNotFoundError:
        logger.info("Git pull: git not found on PATH; pull loop disabled")
        return False
    except subprocess.CalledProcessError as exc:
        logger.warning(
            "Git pull: git command failed, skipping: %s",
            (exc.stderr or "").strip(),
        )
        return False
    finally:
        self._cleanup_git_env(env)
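
The control flow of sync_once can be hard to follow through the nesting. The sketch below reduces it to its three outcomes, with the git invocation injected so the logic can be exercised without a repository. It is a simplification: where the real method saves conflict files, this version only aborts the rebase. The names update_from_upstream, git_runner, and Runner are hypothetical.

```python
from __future__ import annotations

import subprocess
from typing import Callable, Sequence

# A runner takes git arguments and raises CalledProcessError on failure.
Runner = Callable[[Sequence[str]], None]


def update_from_upstream(run: Runner) -> str:
    """Return "fast-forward", "rebase", or "aborted"."""
    run(["fetch"])
    try:
        run(["merge", "--ff-only", "@{upstream}"])
        return "fast-forward"
    except subprocess.CalledProcessError:
        pass  # branches diverged; try replaying local commits

    try:
        run(["rebase", "@{upstream}"])
        return "rebase"
    except subprocess.CalledProcessError:
        run(["rebase", "--abort"])  # restore the pre-rebase state
        return "aborted"


def git_runner(repo: str) -> Runner:
    """Build a runner that executes real git commands inside `repo`."""

    def run(args: Sequence[str]) -> None:
        subprocess.run(
            ["git", "-C", repo, *args],
            capture_output=True,
            text=True,
            check=True,
        )

    return run
```

With a real checkout, update_from_upstream(git_runner("/path/to/vault")) would run the same sequence of commands against that repository.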

start(*, repo_path, pull_interval_s, pause_writes=None, on_pull=None)

Start a periodic fetch + ff-only update loop in a daemon thread.

Source code in src/markdown_vault_mcp/git.py
def start(
    self,
    *,
    repo_path: Path,
    pull_interval_s: int,
    pause_writes: Callable[[], contextlib.AbstractContextManager[None]]
    | None = None,
    on_pull: Callable[[], None] | None = None,
) -> None:
    """Start a periodic fetch + ff-only update loop in a daemon thread."""
    if self._closed or not self._enable_pull or pull_interval_s <= 0:
        return

    git_root = self._ensure_git_root(repo_path)
    if git_root is None:
        return

    # Guard: do not start the loop if there is no upstream configured.
    # This check is intentionally independent of the sync_once() call in
    # sync_from_remote_before_index() — start() may be called even when
    # the startup sync was skipped (pull_interval_s changed at runtime,
    # or Collection.start() called directly by library users).  The double
    # upstream check is harmless (costs one git subprocess) and avoids
    # noisy "no upstream" logs on every tick.
    env = None
    try:
        env = self._git_env()
        upstream_check = subprocess.run(
            ["git", "-C", str(git_root), "rev-parse", "--verify", "@{upstream}"],
            capture_output=True,
            text=True,
            env=env,
        )
        if upstream_check.returncode != 0:
            logger.info("Git pull: no upstream configured; pull loop disabled")
            return
    except FileNotFoundError:
        logger.info("Git pull: git not found on PATH; pull loop disabled")
        return
    finally:
        self._cleanup_git_env(env)

    with self._lock:
        if self._pull_thread is not None and self._pull_thread.is_alive():
            return
        self._pull_repo_path = repo_path
        self._pull_interval_s = pull_interval_s
        self._pause_writes = pause_writes
        self._on_pull = on_pull
        self._pull_stop.clear()
        self._pull_thread = threading.Thread(
            target=self._pull_loop, name="GitPullLoop", daemon=True
        )
        self._pull_thread.start()

stop()

Stop the pull loop thread if it is running.

Source code in src/markdown_vault_mcp/git.py
def stop(self) -> None:
    """Stop the pull loop thread if it is running."""
    with self._lock:
        thread = self._pull_thread
        if thread is None:
            return
        self._pull_stop.set()
    # Do not block indefinitely on shutdown.
    thread.join(timeout=5.0)
    with self._lock:
        if self._pull_thread is thread:
            self._pull_thread = None
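
start() and stop() together follow a common pattern: a daemon thread that sleeps on a threading.Event so it can be woken early for shutdown. The loop body (_pull_loop) is not shown on this page, so the sketch below is an assumption about its general shape rather than a copy of it; the class name PeriodicLoop is hypothetical.

```python
from __future__ import annotations

import logging
import threading
from typing import Callable

logger = logging.getLogger(__name__)


class PeriodicLoop:
    """Call `tick` every `interval_s` seconds until stopped."""

    def __init__(self, tick: Callable[[], None], interval_s: float) -> None:
        self._tick = tick
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._lock = threading.Lock()
        self._thread: threading.Thread | None = None

    def start(self) -> None:
        if self._interval_s <= 0:
            return
        with self._lock:
            if self._thread is not None and self._thread.is_alive():
                return  # already running; a second start() is a no-op
            self._stop.clear()
            self._thread = threading.Thread(
                target=self._run, name="DemoPullLoop", daemon=True
            )
            self._thread.start()

    def _run(self) -> None:
        # Event.wait returns True as soon as stop() sets the event,
        # so shutdown does not have to wait out a full interval.
        while not self._stop.wait(self._interval_s):
            try:
                self._tick()
            except Exception:
                logger.exception("tick failed; loop continues")

    def stop(self, timeout: float = 5.0) -> None:
        with self._lock:
            thread = self._thread
            if thread is None:
                return
            self._stop.set()
        thread.join(timeout=timeout)  # bounded wait, outside the lock
        with self._lock:
            if self._thread is thread:
                self._thread = None
```

Joining outside the lock matters here: if the tick itself needs the same lock, joining while holding it would stall shutdown until the join timeout expires.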

flush()

Block until any pending push completes.

Cancels the idle timer and pushes immediately if there are pending local commits.

Source code in src/markdown_vault_mcp/git.py
def flush(self) -> None:
    """Block until any pending push completes.

    Cancels the idle timer and pushes immediately if there are
    pending local commits.
    """
    with self._lock:
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        pending = self._push_pending

    if pending and self._git_root is not None:
        self._do_push_safe()

close()

Cancel timer, flush pending push, mark strategy as closed.

Source code in src/markdown_vault_mcp/git.py
def close(self) -> None:
    """Cancel timer, flush pending push, mark strategy as closed."""
    self._closed = True
    self.stop()
    self.flush()

git_write_strategy(token=None, push_delay_s=0, git_lfs=True)

Create a GitWriteStrategy callback.

Convenience wrapper around GitWriteStrategy. With the default push_delay_s=0, commits happen per write, but the push fires only when GitWriteStrategy.close() or GitWriteStrategy.flush() is called.

When used via markdown_vault_mcp.collection.Collection, Collection.close() automatically calls the strategy's close(), so pending pushes are flushed on shutdown. Callers using this as a bare WriteCallback must retain a reference and call close() explicitly.

Deprecated: Prefer GitWriteStrategy directly for access to GitWriteStrategy.flush() and GitWriteStrategy.close().

Note: The default push_delay_s=0 here differs from GitWriteStrategy's default of 30.0. This preserves backward compatibility (push on close/flush only).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| token | str \| None | PAT for HTTPS push. | None |
| push_delay_s | float | Push delay in seconds (default 0 = push on close only). | 0 |
| git_lfs | bool | When True (default), run git lfs pull during init. | True |

Returns:

| Type | Description |
|---|---|
| GitWriteStrategy | A GitWriteStrategy instance (also satisfies markdown_vault_mcp.types.WriteCallback). |

Source code in src/markdown_vault_mcp/git.py
def git_write_strategy(
    token: str | None = None,
    push_delay_s: float = 0,
    git_lfs: bool = True,
) -> GitWriteStrategy:
    """Create a :class:`GitWriteStrategy` callback.

    Convenience wrapper around :class:`GitWriteStrategy`.  With the
    default ``push_delay_s=0``, commits happen per-write but push only
    fires when :meth:`~GitWriteStrategy.close` or
    :meth:`~GitWriteStrategy.flush` is called.

    When used via :class:`~markdown_vault_mcp.collection.Collection`,
    ``Collection.close()`` automatically calls the strategy's
    ``close()``, so pushes flush on shutdown.  Callers using this
    as a bare ``WriteCallback`` must retain a reference and call
    ``close()`` explicitly.

    .. deprecated::
        Prefer :class:`GitWriteStrategy` directly for access to
        :meth:`~GitWriteStrategy.flush` and :meth:`~GitWriteStrategy.close`.

    .. note::
        The default ``push_delay_s=0`` here differs from
        :class:`GitWriteStrategy`'s default of ``30.0``.  This preserves
        backward compatibility (push on close/flush only).

    Args:
        token: PAT for HTTPS push.
        push_delay_s: Push delay in seconds (default 0 = push on close only).
        git_lfs: When ``True`` (default), run ``git lfs pull`` during init.

    Returns:
        A :class:`GitWriteStrategy` instance (also satisfies
        :data:`~markdown_vault_mcp.types.WriteCallback`).
    """
    return GitWriteStrategy(token=token, push_delay_s=push_delay_s, git_lfs=git_lfs)