Staging
v0.5.1
https://github.com/git/git

sort by:
Revision Author Date Message Commit Date
6eed462 Git 2.18.5 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 12 February 2021, 14:47:43 UTC
9b77cec Sync with 2.17.6 * maint-2.17: Git 2.17.6 unpack_trees(): start with a fresh lstat cache run-command: invalidate lstat cache after a command finished checkout: fix bug that makes checkout follow symlinks in leading path 12 February 2021, 14:47:42 UTC
6b82d3e Git 2.17.6 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 12 February 2021, 14:47:02 UTC
22539ec unpack_trees(): start with a fresh lstat cache We really want to avoid relying on stale information. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 12 February 2021, 14:47:02 UTC
0d58fef run-command: invalidate lstat cache after a command finished In the previous commit, we intercepted calls to `rmdir()` to invalidate the lstat cache in the successful case, so that the lstat cache could not have the idea that a directory exists where there is none. The same situation can arise, of course, when a separate process is spawned (most notably, this is the case in `submodule_move_head()`). Obviously, we cannot know whether a directory was removed in that process, therefore we must invalidate the lstat cache afterwards. Note: in contrast to `lstat_cache_aware_rmdir()`, we invalidate the lstat cache even in case of an error: the process might have removed a directory and still have failed afterwards. Co-authored-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 12 February 2021, 14:47:02 UTC
684dd4c checkout: fix bug that makes checkout follow symlinks in leading path Before checking out a file, we have to confirm that all of its leading components are real existing directories. And to reduce the number of lstat() calls in this process, we cache the last leading path known to contain only directories. However, when a path collision occurs (e.g. when checking out case-sensitive files in case-insensitive file systems), a cached path might have its file type changed on disk, leaving the cache on an invalid state. Normally, this doesn't bring any bad consequences as we usually check out files in index order, and therefore, by the time the cached path becomes outdated, we no longer need it anyway (because all files in that directory would have already been written). But, there are some users of the checkout machinery that do not always follow the index order. In particular: checkout-index writes the paths in the same order that they appear on the CLI (or stdin); and the delayed checkout feature -- used when a long-running filter process replies with "status=delayed" -- postpones the checkout of some entries, thus modifying the checkout order. When we have to check out an out-of-order entry and the lstat() cache is invalid (due to a previous path collision), checkout_entry() may end up using the invalid data and thrusting that the leading components are real directories when, in reality, they are not. In the best case scenario, where the directory was replaced by a regular file, the user will get an error: "fatal: unable to create file 'foo/bar': Not a directory". But if the directory was replaced by a symlink, checkout could actually end up following the symlink and writing the file at a wrong place, even outside the repository. Since delayed checkout is affected by this bug, it could be used by an attacker to write arbitrary files during the clone of a maliciously crafted repository. Some candidate solutions considered were to disable the lstat() cache during unordered checkouts or sort the entries before passing them to the checkout machinery. But both ideas include some performance penalty and they don't future-proof the code against new unordered use cases. Instead, we now manually reset the lstat cache whenever we successfully remove a directory. Note: We are not even checking whether the directory was the same as the lstat cache points to because we might face a scenario where the paths refer to the same location but differ due to case folding, precomposed UTF-8 issues, or the presence of `..` components in the path. Two regression tests, with case-collisions and utf8-collisions, are also added for both checkout-index and delayed checkout. Note: to make the previously mentioned clone attack unfeasible, it would be sufficient to reset the lstat cache only after the remove_subtree() call inside checkout_entry(). This is the place where we would remove a directory whose path collides with the path of another entry that we are currently trying to check out (possibly a symlink). However, in the interest of a thorough fix that does not leave Git open to similar-but-not-identical attack vectors, we decided to intercept all `rmdir()` calls in one fell swoop. This addresses CVE-2021-21300. Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> 12 February 2021, 14:47:02 UTC
ba6f090 Git 2.18.4 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> 19 April 2020, 23:24:14 UTC
df5be6d Git 2.17.5 Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> 19 April 2020, 23:10:58 UTC
1a3609e fsck: reject URL with empty host in .gitmodules Git's URL parser interprets https:///example.com/repo.git to have no host and a path of "example.com/repo.git". Curl, on the other hand, internally redirects it to https://example.com/repo.git. As a result, until "credential: parse URL without host as empty host, not unset", tricking a user into fetching from such a URL would cause Git to send credentials for another host to example.com. Teach fsck to block and detect .gitmodules files using such a URL to prevent sharing them with Git versions that are not yet protected. A relative URL in a .gitmodules file could also be used to trigger this. The relative URL resolver used for .gitmodules does not normalize sequences of slashes and can follow ".." components out of the path part and to the host part of a URL, meaning that such a relative URL can be used to traverse from a https://foo.example.com/innocent superproject to a https:///attacker.example.com/exploit submodule. Fortunately, redundant extra slashes in .gitmodules are rare, so we can catch this by detecting one after a leading sequence of "./" and "../" components. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Jeff King <peff@peff.net> 19 April 2020, 23:10:58 UTC
e7fab62 credential: treat URL with empty scheme as invalid Until "credential: refuse to operate when missing host or protocol", Git's credential handling code interpreted URLs with empty scheme to mean "give me credentials matching this host for any protocol". Luckily libcurl does not recognize such URLs (it tries to look for a protocol named "" and fails). Just in case that changes, let's reject them within Git as well. This way, credential_from_url is guaranteed to always produce a "struct credential" with protocol and host set. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> 19 April 2020, 23:10:58 UTC
c44088e credential: treat URL without scheme as invalid libcurl permits making requests without a URL scheme specified. In this case, it guesses the URL from the hostname, so I can run git ls-remote http::ftp.example.com/path/to/repo and it would make an FTP request. Any user intentionally using such a URL is likely to have made a typo. Unfortunately, credential_from_url is not able to determine the host and protocol in order to determine appropriate credentials to send, and until "credential: refuse to operate when missing host or protocol", this resulted in another host's credentials being leaked to the named host. Teach credential_from_url_gently to consider such a URL to be invalid so that fsck can detect and block gitmodules files with such URLs, allowing server operators to avoid serving them to downstream users running older versions of Git. This also means that when such URLs are passed on the command line, Git will print a clearer error so affected users can switch to the simpler URL that explicitly specifies the host and protocol they intend. One subtlety: .gitmodules files can contain relative URLs, representing a URL relative to the URL they were cloned from. The relative URL resolver used for .gitmodules can follow ".." components out of the path part and past the host part of a URL, meaning that such a relative URL can be used to traverse from a https://foo.example.com/innocent superproject to a https::attacker.example.com/exploit submodule. Fortunately a leading ':' in the first path component after a series of leading './' and '../' components is unlikely to show up in other contexts, so we can catch this by detecting that pattern. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Jeff King <peff@peff.net> 19 April 2020, 23:10:58 UTC
fe29a9b credential: die() when parsing invalid urls When we try to initialize credential loading by URL and find that the URL is invalid, we set all fields to NULL in order to avoid acting on malicious input. Later when we request credentials, we diagonse the erroneous input: fatal: refusing to work with credential missing host field This is problematic in two ways: - The message doesn't tell the user *why* we are missing the host field, so they can't tell from this message alone how to recover. There can be intervening messages after the original warning of bad input, so the user may not have the context to put two and two together. - The error only occurs when we actually need to get a credential. If the URL permits anonymous access, the only encouragement the user gets to correct their bogus URL is a quiet warning. This is inconsistent with the check we perform in fsck, where any use of such a URL as a submodule is an error. When we see such a bogus URL, let's not try to be nice and continue without helpers. Instead, die() immediately. This is simpler and obviously safe. And there's very little chance of disrupting a normal workflow. It's _possible_ that somebody has a legitimate URL with a raw newline in it. It already wouldn't work with credential helpers, so this patch steps that up from an inconvenience to "we will refuse to work with it at all". If such a case does exist, we should figure out a way to work with it (especially if the newline is only in the path component, which we normally don't even pass to helpers). But until we see a real report, we're better off being defensive. Reported-by: Carlo Arenas <carenas@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> 19 April 2020, 23:10:58 UTC
a2b26ff fsck: convert gitmodules url to URL passed to curl In 07259e74ec1 (fsck: detect gitmodules URLs with embedded newlines, 2020-03-11), git fsck learned to check whether URLs in .gitmodules could be understood by the credential machinery when they are handled by git-remote-curl. However, the check is overbroad: it checks all URLs instead of only URLs that would be passed to git-remote-curl. In principle a git:// or file:/// URL does not need to follow the same conventions as an http:// URL; in particular, git:// and file:// protocols are not succeptible to issues in the credential API because they do not support attaching credentials. In the HTTP case, the URL in .gitmodules does not always match the URL that would be passed to git-remote-curl and the credential machinery: Git's URL syntax allows specifying a remote helper followed by a "::" delimiter and a URL to be passed to it, so that git ls-remote http::https://example.com/repo.git invokes git-remote-http with https://example.com/repo.git as its URL argument. With today's checks, that distinction does not make a difference, but for a check we are about to introduce (for empty URL schemes) it will matter. .gitmodules files also support relative URLs. To ensure coverage for the https based embedded-newline attack, urldecode and check them directly for embedded newlines. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Jeff King <peff@peff.net> 19 April 2020, 23:10:58 UTC
8ba8ed5 credential: refuse to operate when missing host or protocol The credential helper protocol was designed to be very flexible: the fields it takes as input are treated as a pattern, and any missing fields are taken as wildcards. This allows unusual things like: echo protocol=https | git credential reject to delete all stored https credentials (assuming the helpers themselves treat the input that way). But when helpers are invoked automatically by Git, this flexibility works against us. If for whatever reason we don't have a "host" field, then we'd match _any_ host. When you're filling a credential to send to a remote server, this is almost certainly not what you want. Prevent this at the layer that writes to the credential helper. Add a check to the credential API that the host and protocol are always passed in, and add an assertion to the credential_write function that speaks credential helper protocol to be doubly sure. There are a few ways this can be triggered in practice: - the "git credential" command passes along arbitrary credential parameters it reads from stdin. - until the previous patch, when the host field of a URL is empty, we would leave it unset (rather than setting it to the empty string) - a URL like "example.com/foo.git" is treated by curl as if "http://" was present, but our parser sees it as a non-URL and leaves all fields unset - the recent fix for URLs with embedded newlines blanks the URL but otherwise continues. Rather than having the desired effect of looking up no credential at all, many helpers will return _any_ credential Our earlier test for an embedded newline didn't catch this because it only checked that the credential was cleared, but didn't configure an actual helper. Configuring the "verbatim" helper in the test would show that it is invoked (it's obviously a silly helper which doesn't look at its input, but the point is that it shouldn't be run at all). Since we're switching this case to die(), we don't need to bother with a helper. We can see the new behavior just by checking that the operation fails. We'll add new tests covering partial input as well (these can be triggered through various means with url-parsing, but it's simpler to just check them directly, as we know we are covered even if the url parser changes behavior in the future). [jn: changed to die() instead of logging and showing a manual username/password prompt] Reported-by: Carlo Arenas <carenas@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> 19 April 2020, 23:10:58 UTC
2403668 credential: parse URL without host as empty host, not unset We may feed a URL like "cert:///path/to/cert.pem" into the credential machinery to get the key for a client-side certificate. That credential has no hostname field, which is about to be disallowed (to avoid confusion with protocols where a helper _would_ expect a hostname). This means as of the next patch, credential helpers won't work for unlocking certs. Let's fix that by doing two things: - when we parse a url with an empty host, set the host field to the empty string (asking only to match stored entries with an empty host) rather than NULL (asking to match _any_ host). - when we build a cert:// credential by hand, similarly assign an empty string It's the latter that is more likely to impact real users in practice, since it's what's used for http connections. But we don't have good infrastructure to test it. The url-parsing version will help anybody using git-credential in a script, and is easy to test. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> 19 April 2020, 23:10:57 UTC
73aafe9 t0300: use more realistic inputs Many of the tests in t0300 give partial inputs to git-credential, omitting a protocol or hostname. We're checking only high-level things like whether and how helpers are invoked at all, and we don't care about specific hosts. However, in preparation for tightening up the rules about when we're willing to run a helper, let's start using input that's a bit more realistic: pretend as if http://example.com is being examined. This shouldn't change the point of any of the tests, but do note we have to adjust the expected output to accommodate this (filling a credential will repeat back the protocol/host fields to stdout, and the helper debug messages and askpass prompt will change on stderr). Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> 19 April 2020, 23:10:57 UTC
a88dbd2 t0300: make "quit" helper more realistic We test a toy credential helper that writes "quit=1" and confirms that we stop running other helpers. However, that helper is unrealistic in that it does not bother to read its stdin at all. For now we don't send any input to it, because we feed git-credential a blank credential. But that will change in the next patch, which will cause this test to racily fail, as git-credential will get SIGPIPE writing to the helper rather than exiting because it was asked to. Let's make this one-off helper more like our other sample helpers, and have it source the "dump" script. That will read stdin, fixing the SIGPIPE problem. But it will also write what it sees to stderr. We can make the test more robust by checking that output, which confirms that we do run the quit helper, don't run any other helpers, and exit for the reason we expected. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> 19 April 2020, 23:10:52 UTC
21a3e50 Git 2.18.3 Signed-off-by: Junio C Hamano <gitster@pobox.com> 17 March 2020, 20:34:12 UTC
c42c0f1 Git 2.17.4 Signed-off-by: Junio C Hamano <gitster@pobox.com> 17 March 2020, 20:25:33 UTC
07259e7 fsck: detect gitmodules URLs with embedded newlines The credential protocol can't handle values with newlines. We already detect and block any such URLs from being used with credential helpers, but let's also add an fsck check to detect and block gitmodules files with such URLs. That will let us notice the problem earlier when transfer.fsckObjects is turned on. And in particular it will prevent bad objects from spreading, which may protect downstream users running older versions of Git. We'll file this under the existing gitmodulesUrl flag, which covers URLs with option injection. There's really no need to distinguish the exact flaw in the URL in this context. Likewise, I've expanded the description of t7416 to cover all types of bogus URLs. 12 March 2020, 06:56:50 UTC
c716fe4 credential: detect unrepresentable values when parsing urls The credential protocol can't represent newlines in values, but URLs can embed percent-encoded newlines in various components. A previous commit taught the low-level writing routines to die() when encountering this, but we can be a little friendlier to the user by detecting them earlier and handling them gracefully. This patch teaches credential_from_url() to notice such components, issue a warning, and blank the credential (which will generally result in prompting the user for a username and password). We blank the whole credential in this case. Another option would be to blank only the invalid component. However, we're probably better off not feeding a partially-parsed URL result to a credential helper. We don't know how a given helper would handle it, so we're better off to err on the side of matching nothing rather than something unexpected. The die() call in credential_write() is _probably_ impossible to reach after this patch. Values should end up in credential structs only by URL parsing (which is covered here), or by reading credential protocol input (which by definition cannot read a newline into a value). But we should definitely keep the low-level check, as it's our final and most accurate line of defense against protocol injection attacks. Arguably it could become a BUG(), but it probably doesn't matter much either way. Note that the public interface of credential_from_url() grows a little more than we need here. We'll use the extra flexibility in a future patch to help fsck catch these cases. 12 March 2020, 06:55:24 UTC
17f1c0b t/lib-credential: use test_i18ncmp to check stderr The credential tests have a "check" function which feeds some input to git-credential and checks the stdout and stderr. We look for exact matches in the output. For stdout, this makes sense; the output is the credential protocol. But for stderr, we may be showing various diagnostic messages, or the prompts fed to the askpass program, which could be translated. Let's mark them as such. 12 March 2020, 06:55:17 UTC
9a6bbee credential: avoid writing values with newlines The credential protocol that we use to speak to helpers can't represent values with newlines in them. This was an intentional design choice to keep the protocol simple, since none of the values we pass should generally have newlines. However, if we _do_ encounter a newline in a value, we blindly transmit it in credential_write(). Such values may break the protocol syntax, or worse, inject new valid lines into the protocol stream. The most likely way for a newline to end up in a credential struct is by decoding a URL with a percent-encoded newline. However, since the bug occurs at the moment we write the value to the protocol, we'll catch it there. That should leave no possibility of accidentally missing a code path that can trigger the problem. At this level of the code we have little choice but to die(). However, since we'd not ever expect to see this case outside of a malicious URL, that's an acceptable outcome. Reported-by: Felix Wilhelm <fwilhelm@google.com> 12 March 2020, 06:55:16 UTC
9877106 Git 2.18.2 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 06 December 2019, 15:29:17 UTC
14af7ed Sync with 2.17.3 * maint-2.17: (32 commits) Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names ... 06 December 2019, 15:29:15 UTC
a5ab8d0 Git 2.17.3 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 06 December 2019, 15:27:38 UTC
bb92255 fsck: reject submodule.update = !command in .gitmodules This allows hosting providers to detect whether they are being used to attack users using malicious 'update = !command' settings in .gitmodules. Since ac1fbbda2013 (submodule: do not copy unknown update mode from .gitmodules, 2013-12-02), in normal cases such settings have been treated as 'update = none', so forbidding them should not produce any collateral damage to legitimate uses. A quick search does not reveal any repositories making use of this construct, either. Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 06 December 2019, 15:27:38 UTC
bdfef04 Sync with 2.16.6 * maint-2.16: (31 commits) Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses ... 06 December 2019, 15:27:36 UTC
eb288bc Git 2.16.6 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 06 December 2019, 15:27:20 UTC
6844049 test-drop-caches: use `has_dos_drive_prefix()` This is a companion patch to 'mingw: handle `subst`-ed "DOS drives"': use the DOS drive prefix handling that is already provided by `compat/mingw.c` (and which just learned to handle non-alphabetical "drive letters"). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 06 December 2019, 15:27:20 UTC
9ac92fe Sync with 2.15.4 * maint-2.15: (29 commits) Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses clone --recurse-submodules: prevent name squatting on Windows is_ntfs_dotgit(): only verify the leading segment ... 06 December 2019, 15:27:18 UTC
7cdafca Git 2.15.4 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 06 December 2019, 15:26:58 UTC
e904deb submodule: reject submodule.update = !command in .gitmodules Since ac1fbbda2013 (submodule: do not copy unknown update mode from .gitmodules, 2013-12-02), Git has been careful to avoid copying [submodule "foo"] update = !run an arbitrary scary command from .gitmodules to a repository's local config, copying in the setting 'update = none' instead. The gitmodules(5) manpage documents the intention: The !command form is intentionally ignored here for security reasons Unfortunately, starting with v2.20.0-rc0 (which integrated ee69b2a9 (submodule--helper: introduce new update-module-mode helper, 2018-08-13, first released in v2.20.0-rc0)), there are scenarios where we *don't* ignore it: if the config store contains no submodule.foo.update setting, the submodule-config API falls back to reading .gitmodules and the repository-supplied !command gets run after all. This was part of a general change over time in submodule support to read more directly from .gitmodules, since unlike .git/config it allows a project to change values between branches and over time (while still allowing .git/config to override things). But it was never intended to apply to this kind of dangerous configuration. The behavior change was not advertised in ee69b2a9's commit message and was missed in review. Let's take the opportunity to make the protection more robust, even in Git versions that are technically not affected: instead of quietly converting 'update = !command' to 'update = none', noisily treat it as an error. Allowing the setting but treating it as meaning something else was just confusing; users are better served by seeing the error sooner. Forbidding the construct makes the semantics simpler and means we can check for it in fsck (in a separate patch). As a result, the submodule-config API cannot read this value from .gitmodules under any circumstance, and we can declare with confidence For security reasons, the '!command' form is not accepted here. Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> 06 December 2019, 15:26:58 UTC
d3ac8c3 Sync with 2.14.6 * maint-2.14: (28 commits) Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses clone --recurse-submodules: prevent name squatting on Windows is_ntfs_dotgit(): only verify the leading segment test-path-utils: offer to run a protectNTFS/protectHFS benchmark ... 06 December 2019, 15:26:55 UTC
66d2a61 Git 2.14.6 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 06 December 2019, 15:26:15 UTC
2ddcccf Merge branch 'win32-accommodate-funny-drive-names' While the only permitted drive letters for physical drives on Windows are letters of the US-English alphabet, this restriction does not apply to virtual drives assigned via `subst <letter>: <path>`. To prevent targeted attacks against systems where "funny" drive letters such as `1` or `!` are assigned, let's handle them as regular drive letters on Windows. This fixes CVE-2019-1351. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:37:09 UTC
65d30a1 Merge branch 'win32-filenames-cannot-have-trailing-spaces-or-periods' On Windows, filenames cannot have trailing spaces or periods, when opening such paths, they are stripped automatically. Read: you can open the file `README` via the file name `README . . .`. This ambiguity can be used in combination with other security bugs to cause e.g. remote code execution during recursive clones. This patch series fixes that. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:37:09 UTC
5532ebd Merge branch 'fix-mingw-quoting-bug' This patch fixes a vulnerability in the Windows-specific code where a submodule names ending in a backslash were quoted incorrectly, and that bug could be abused to insert command-line parameters e.g. to `ssh` in a recursive clone. Note: this bug is Windows-only, as we have to construct a command line for the process-to-spawn, unlike Linux/macOS, where `execv()` accepts an already-split command line. While at it, other quoting issues are fixed as well. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:37:08 UTC
76a681c Merge branch 'dubiously-nested-submodules' Recursive clones are currently affected by a vulnerability that is caused by too-lax validation of submodule names. This topic branch fixes that. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:37:08 UTC
dd53ea7 Merge branch 'turn-on-protectntfs-by-default' This patch series makes it safe to use Git on Windows drives, even if running on a mounted network share or within the Windows Subsystem for Linux (WSL). This topic branch addresses CVE-2019-1353. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:37:08 UTC
7f3551d Merge branch 'disallow-dotgit-via-ntfs-alternate-data-streams' This patch series plugs an attack vector we had overlooked in our December 2014 work on `core.protectNTFS`. Essentially, the path `.git::$INDEX_ALLOCATION/config` is interpreted as `.git/config` when NTFS Alternate Data Streams are available (which they are on Windows, and at least on network shares that are SMB-mounted on macOS). Needless to say: we don't want that. In fact, we want to stay on the very safe side and not even special-case the `$INDEX_ALLOCATION` stream type: let's just prevent Git from touching _any_ explicitly specified Alternate Data Stream of `.git`. In essence, we'll prevent Git from tracking, or writing to, any path with a segment of the form `.git:<anything>`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:37:07 UTC
f82a97e mingw: handle `subst`-ed "DOS drives" Over a decade ago, in 25fe217b86c (Windows: Treat Windows style path names., 2008-03-05), Git was taught to handle absolute Windows paths, i.e. paths that start with a drive letter and a colon. Unbeknownst to us, while drive letters of physical drives are limited to letters of the English alphabet, there is a way to assign virtual drive letters to arbitrary directories, via the `subst` command, which is _not_ limited to English letters. It is therefore possible to have absolute Windows paths of the form `1:\what\the\hex.txt`. Even "better": pretty much arbitrary Unicode letters can also be used, e.g. `ä:\tschibät.sch`. While it can be sensibly argued that users who set up such funny drive letters really seek adverse consequences, the Windows Operating System is known to be a platform where many users are at the mercy of administrators who have their very own idea of what constitutes a reasonable setup. Therefore, let's just make sure that such funny paths are still considered absolute paths by Git, on Windows. In addition to Unicode characters, pretty much any character is a valid drive letter, as far as `subst` is concerned, even `:` and `"` or even a space character. While it is probably the opposite of smart to use them, let's safeguard `is_dos_drive_prefix()` against all of them. Note: `[::1]:repo` is a valid URL, but not a valid path on Windows. As `[` is now considered a valid drive letter, we need to be very careful to avoid misinterpreting such a string as valid local path in `url_is_local_not_ssh()`. To do that, we use the just-introduced function `is_valid_path()` (which will label the string as invalid file name because of the colon characters). This fixes CVE-2019-1351. Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:37:07 UTC
379e51d quote-stress-test: offer to test quoting arguments for MSYS2 sh It is unfortunate that we need to quote arguments differently on Windows, depending whether we build a command-line for MSYS2's `sh` or for other Windows executables. We already have a test helper to verify the latter, with this patch we can also verify the former. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:37:06 UTC
7530a62 quote-stress-test: allow skipping some trials When the, say, 93rd trial run fails, it is a good idea to have a way to skip the first 92 trials and dig directly into the 93rd in a debugger. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:37:06 UTC
817ddd6 mingw: refuse to access paths with illegal characters Certain characters are not admissible in file names on Windows, even if Cygwin/MSYS2 (and therefore, Git for Windows' Bash) pretend that they are, e.g. `:`, `<`, `>`, etc Let's disallow those characters explicitly in Windows builds of Git. Note: just like trailing spaces or periods, it _is_ possible on Windows to create commits adding files with such illegal characters, as long as the operation leaves the worktree untouched. To allow for that, we continue to guard `is_valid_win32_path()` behind the config setting `core.protectNTFS`, so that users _can_ continue to do that, as long as they turn the protections off via that config setting. Among other problems, this prevents Git from trying to write to an "NTFS Alternate Data Stream" (which refers to metadata stored alongside a file, under a special name: "<filename>:<stream-name>"). This fix therefore also prevents an attack vector that was exploited in demonstrations of a number of recently-fixed security bugs. Further reading on illegal characters in Win32 filenames: https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:37:06 UTC
d2c84da mingw: refuse to access paths with trailing spaces or periods When creating a directory on Windows whose path ends in a space or a period (or chains thereof), the Win32 API "helpfully" trims those. For example, `mkdir("abc ");` will return success, but actually create a directory called `abc` instead. This stems back to the DOS days, when all file names had exactly 8 characters plus exactly 3 characters for the file extension, and the only way to have shorter names was by padding with spaces. Sadly, this "helpful" behavior is a bit inconsistent: after a successful `mkdir("abc ");`, a `mkdir("abc /def")` will actually _fail_ (because the directory `abc ` does not actually exist). Even if it would work, we now have a serious problem because a Git repository could contain directories `abc` and `abc `, and on Windows, they would be "merged" unintentionally. As these paths are illegal on Windows, anyway, let's disallow any accesses to such paths on that Operating System. For practical reasons, this behavior is still guarded by the config setting `core.protectNTFS`: it is possible (and at least two regression tests make use of it) to create commits without involving the worktree. In such a scenario, it is of course possible -- even on Windows -- to create such file names. Among other consequences, this patch disallows submodules' paths to end in spaces on Windows (which would formerly have confused Git enough to try to write into incorrect paths, anyway). While this patch does not fix a vulnerability on its own, it prevents an attack vector that was exploited in demonstrations of a number of recently-fixed security bugs. The regression test added to `t/t7417-submodule-path-url.sh` reflects that attack vector. Note that we have to adjust the test case "prevent git~1 squatting on Windows" in `t/t7415-submodule-names.sh` because of a very subtle issue. It tries to clone two submodules whose names differ only in a trailing period character, and as a consequence their git directories differ in the same way. Previously, when Git tried to clone the second submodule, it thought that the git directory already existed (because on Windows, when you create a directory with the name `b.` it actually creates `b`), but with this patch, the first submodule's clone will fail because of the illegal name of the git directory. Therefore, when cloning the second submodule, Git will take a different code path: a fresh clone (without an existing git directory). Both code paths fail to clone the second submodule, both because the the corresponding worktree directory exists and is not empty, but the error messages are worded differently. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:37:06 UTC
cc756ed unpack-trees: let merged_entry() pass through do_add_entry()'s errors A `git clone` will end with exit code 0 when `merged_entry()` returns a positive value during a call of `unpack_trees()` to `traverse_trees()`. The reason is that `unpack_trees()` will interpret a positive value not to be an error. The problem is, however, that `add_index_entry()` (which is called by `merged_entry()` can report an error, and we really should fail the entire clone in such a case. Let's fix this problem, in preparation for a Windows-specific patch disallowing `mkdir()` with directory names that contain a trailing space (which is illegal on NTFS): we want `git clone` to abort when a path cannot be checked out due to that condition. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:37:06 UTC
35edce2 t6130/t9350: prepare for stringent Win32 path validation On Windows, file names cannot contain asterisks nor newline characters. In an upcoming commit, we will make this limitation explicit, disallowing even the creation of commits that introduce such file names. However, in the test scripts touched by this patch, we _know_ that those paths won't be checked out, so we _want_ to allow such file names. Happily, the stringent path validation will be guarded via the `core.protectNTFS` flag, so all we need to do is to force that flag off temporarily. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:37:06 UTC
55953c7 quote-stress-test: accept arguments to test via the command-line When the stress test reported a problem with quoting certain arguments, it is helpful to have a facility to play with those arguments in order to find out whether variations of those arguments are affected, too. Let's allow `test-run-command quote-stress-test -- <args>` to be used for that purpose. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:36:53 UTC
ad15592 tests: add a helper to stress test argument quoting On Windows, we have to do all the command-line argument quoting ourselves. Worse: we have to have two versions of said quoting, one for MSYS2 programs (which have their own dequoting rules) and the rest. We care mostly about the rest, and to make sure that that works, let's have a stress test that comes up with all kinds of awkward arguments, verifying that a spawned sub-process receives those unharmed. Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:36:52 UTC
6d86841 mingw: fix quoting of arguments We need to be careful to follow proper quoting rules. For example, if an argument contains spaces, we have to quote them. Double-quotes need to be escaped. Backslashes need to be escaped, but only if they are followed by a double-quote character. We need to be _extra_ careful to consider the case where an argument ends in a backslash _and_ needs to be quoted: in this case, we append a double-quote character, i.e. the backslash now has to be escaped! The current code, however, fails to recognize that, and therefore can turn an argument that ends in a single backslash into a quoted argument that now ends in an escaped double-quote character. This allows subsequent command-line parameters to be split and part of them being mistaken for command-line options, e.g. through a maliciously-crafted submodule URL during a recursive clone. Technically, we would not need to quote _all_ arguments which end in a backslash _unless_ the argument needs to be quoted anyway. For example, `test\` would not need to be quoted, while `test \` would need to be. To keep the code simple, however, and therefore easier to reason about and ensure its correctness, we now _always_ quote an argument that ends in a backslash. This addresses CVE-2019-1350. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:36:51 UTC
9102f95 protect_ntfs: turn on NTFS protection by default Back in the DOS days, in the FAT file system, file names always consisted of a base name of length 8 plus a file extension of length 3. Shorter file names were simply padded with spaces to the full 8.3 format. Later, the FAT file system was taught to support _also_ longer names, with an 8.3 "short name" as primary file name. While at it, the same facility allowed formerly illegal file names, such as `.git` (empty base names were not allowed), which would have the "short name" `git~1` associated with it. For backwards-compatibility, NTFS supports alternative 8.3 short filenames, too, even if starting with Windows Vista, they are only generated on the system drive by default. We addressed the problem that the `.git/` directory can _also_ be accessed via `git~1/` (when short names are enabled) in 2b4c6efc821 (read-cache: optionally disallow NTFS .git variants, 2014-12-16), i.e. since Git v1.9.5, by introducing the config setting `core.protectNTFS` and enabling it by default on Windows. In the meantime, Windows 10 introduced the "Windows Subsystem for Linux" (short: WSL), i.e. a way to run Linux applications/distributions in a thinly-isolated subsystem on Windows (giving rise to many a "2016 is the Year of Linux on the Desktop" jokes). WSL is getting increasingly popular, also due to the painless way Linux application can operate directly ("natively") on files on Windows' file system: the Windows drives are mounted automatically (e.g. `C:` as `/mnt/c/`). Taken together, this means that we now have to enable the safe-guards of Git v1.9.5 also in WSL: it is possible to access a `.git` directory inside `/mnt/c/` via the 8.3 name `git~1` (unless short name generation was disabled manually). Since regular Linux distributions run in WSL, this means we have to enable `core.protectNTFS` at least on Linux, too. To enable Services for Macintosh in Windows NT to store so-called resource forks, NTFS introduced "Alternate Data Streams". Essentially, these constitute additional metadata that are connected to (and copied with) their associated files, and they are accessed via pseudo file names of the form `filename:<stream-name>:<stream-type>`. In a recent patch, we extended `core.protectNTFS` to also protect against accesses via NTFS Alternate Data Streams, e.g. to prevent contents of the `.git/` directory to be "tracked" via yet another alternative file name. While it is not possible (at least by default) to access files via NTFS Alternate Data Streams from within WSL, the defaults on macOS when mounting network shares via SMB _do_ allow accessing files and directories in that way. Therefore, we need to enable `core.protectNTFS` on macOS by default, too, and really, on any Operating System that can mount network shares via SMB/CIFS. A couple of approaches were considered for fixing this: 1. We could perform a dynamic NTFS check similar to the `core.symlinks` check in `init`/`clone`: instead of trying to create a symbolic link in the `.git/` directory, we could create a test file and try to access `.git/config` via 8.3 name and/or Alternate Data Stream. 2. We could simply "flip the switch" on `core.protectNTFS`, to make it "on by default". The obvious downside of 1. is that it won't protect worktrees that were clone with a vulnerable Git version already. We considered patching code paths that check out files to check whether we're running on an NTFS system dynamically and persist the result in the repository-local config setting `core.protectNTFS`, but in the end decided that this solution would be too fragile, and too involved. The obvious downside of 2. is that everybody will have to "suffer" the performance penalty incurred from calling `is_ntfs_dotgit()` on every path, even in setups where. After the recent work to accelerate `is_ntfs_dotgit()` in most cases, it looks as if the time spent on validating ten million random file names increases only negligibly (less than 20ms, well within the standard deviation of ~50ms). Therefore the benefits outweigh the cost. Another downside of this is that paths that might have been acceptable previously now will be forbidden. Realistically, though, this is an improvement because public Git hosters already would reject any `git push` that contains such file names. Note: There might be a similar problem mounting HFS+ on Linux. However, this scenario has been considered unlikely and in light of the cost (in the aforementioned benchmark, `core.protectHFS = true` increased the time from ~440ms to ~610ms), it was decided _not_ to touch the default of `core.protectHFS`. This change addresses CVE-2019-1353. Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com> Helped-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:36:51 UTC
3a85dc7 is_ntfs_dotgit(): speed it up Previously, this function was written without focusing on speed, intending to make reviewing the code as easy as possible, to avoid any bugs in this critical code. Turns out: we can do much better on both accounts. With this patch, we make it as fast as this developer can make it go: - We avoid the call to `is_dir_sep()` and make all the character comparisons explicit. - We avoid the cost of calling `strncasecmp()` and unroll the test for `.git` and `git~1`, not even using `tolower()` because it is faster to compare against two constant values. - We look for `.git` and `.git~1` first thing, and return early if not found. - We also avoid calling a separate function for detecting chains of spaces and periods. Each of these improvements has a noticeable impact on the speed of `is_ntfs_dotgit()`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:36:51 UTC
a8dee3c Disallow dubiously-nested submodule git directories Currently it is technically possible to let a submodule's git directory point right into the git dir of a sibling submodule. Example: the git directories of two submodules with the names `hippo` and `hippo/hooks` would be `.git/modules/hippo/` and `.git/modules/hippo/hooks/`, respectively, but the latter is already intended to house the former's hooks. In most cases, this is just confusing, but there is also a (quite contrived) attack vector where Git can be fooled into mistaking remote content for file contents it wrote itself during a recursive clone. Let's plug this bug. To do so, we introduce the new function `validate_submodule_git_dir()` which simply verifies that no git dir exists for any leading directories of the submodule name (if there are any). Note: this patch specifically continues to allow sibling modules names of the form `core/lib`, `core/doc`, etc, as long as `core` is not a submodule name. This fixes CVE-2019-1387. Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 05 December 2019, 14:36:51 UTC
4778452 Merge branch 'prevent-name-squatting-on-windows' This patch series fixes an issue where Git could formerly have been tricked into creating a `.git` file with an unexpected (and therefore unprotected) NTFS short name. Incidentally, it also fixes an issue where a tree entry containing a backslash could be tricked into following a symbolic link, i.e. Git could be tricked into writing files outside the worktree. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 04 December 2019, 12:23:22 UTC
a7b1ad3 Merge branch 'jk/fast-import-unsafe' The `--export-marks` option of `git fast-import` is exposed also via the in-stream command `feature export-marks=...` and it allows overwriting arbitrary paths. This topic branch prevents the in-stream version, to prevent arbitrary file accesses by `git fast-import` streams coming from untrusted sources (e.g. in remote helpers that are based on `git fast-import`). This fixes CVE-2019-1348. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 04 December 2019, 12:23:22 UTC
e1d911d mingw: disallow backslash characters in tree objects' file names The backslash character is not a valid part of a file name on Windows. Hence it is dangerous to allow writing files that were unpacked from tree objects, when the stored file name contains a backslash character: it will be misinterpreted as directory separator. This not only causes ambiguity when a tree contains a blob `a\b` and a tree `a` that contains a blob `b`, but it also can be used as part of an attack vector to side-step the careful protections against writing into the `.git/` directory during a clone of a maliciously-crafted repository. Let's prevent that, addressing CVE-2019-1354. Note: we guard against backslash characters in tree objects' file names _only_ on Windows (because on other platforms, even on those where NTFS volumes can be mounted, the backslash character is _not_ a directory separator), and _only_ when `core.protectNTFS = true` (because users might need to generate tree objects for other platforms, of course without touching the worktree, e.g. using `git update-index --cacheinfo`). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 04 December 2019, 12:20:05 UTC
0060fd1 clone --recurse-submodules: prevent name squatting on Windows In addition to preventing `.git` from being tracked by Git, on Windows we also have to prevent `git~1` from being tracked, as the default NTFS short name (also known as the "8.3 filename") for the file name `.git` is `git~1`, otherwise it would be possible for malicious repositories to write directly into the `.git/` directory, e.g. a `post-checkout` hook that would then be executed _during_ a recursive clone. When we implemented appropriate protections in 2b4c6efc821 (read-cache: optionally disallow NTFS .git variants, 2014-12-16), we had analyzed carefully that the `.git` directory or file would be guaranteed to be the first directory entry to be written. Otherwise it would be possible e.g. for a file named `..git` to be assigned the short name `git~1` and subsequently, the short name generated for `.git` would be `git~2`. Or `git~3`. Or even `~9999999` (for a detailed explanation of the lengths we have to go to protect `.gitmodules`, see the commit message of e7cb0b4455c (is_ntfs_dotgit: match other .git files, 2018-05-11)). However, by exploiting two issues (that will be addressed in a related patch series close by), it is currently possible to clone a submodule into a non-empty directory: - On Windows, file names cannot end in a space or a period (for historical reasons: the period separating the base name from the file extension was not actually written to disk, and the base name/file extension was space-padded to the full 8/3 characters, respectively). Helpfully, when creating a directory under the name, say, `sub.`, that trailing period is trimmed automatically and the actual name on disk is `sub`. This means that while Git thinks that the submodule names `sub` and `sub.` are different, they both access `.git/modules/sub/`. - While the backslash character is a valid file name character on Linux, it is not so on Windows. As Git tries to be cross-platform, it therefore allows backslash characters in the file names stored in tree objects. Which means that it is totally possible that a submodule `c` sits next to a file `c\..git`, and on Windows, during recursive clone a file called `..git` will be written into `c/`, of course _before_ the submodule is cloned. Note that the actual exploit is not quite as simple as having a submodule `c` next to a file `c\..git`, as we have to make sure that the directory `.git/modules/b` already exists when the submodule is checked out, otherwise a different code path is taken in `module_clone()` that does _not_ allow a non-empty submodule directory to exist already. Even if we will address both issues nearby (the next commit will disallow backslash characters in tree entries' file names on Windows, and another patch will disallow creating directories/files with trailing spaces or periods), it is a wise idea to defend in depth against this sort of attack vector: when submodules are cloned recursively, we now _require_ the directory to be empty, addressing CVE-2019-1349. Note: the code path we patch is shared with the code path of `git submodule update --init`, which must not expect, in general, that the directory is empty. Hence we have to introduce the new option `--force-init` and hand it all the way down from `git submodule` to the actual `git submodule--helper` process that performs the initial clone. Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> 04 December 2019, 12:20:05 UTC
a52ed76 fast-import: disallow "feature import-marks" by default As with export-marks in the previous commit, import-marks can access the filesystem. This is significantly less dangerous than export-marks because it only involves reading from arbitrary paths, rather than writing them. However, it could still be surprising and have security implications (e.g., exfiltrating data from a service that accepts fast-import streams). Let's lump it (and its "if-exists" counterpart) in with export-marks, and enable the in-stream version only if --allow-unsafe-features is set. Signed-off-by: Jeff King <peff@peff.net> 04 December 2019, 12:20:04 UTC
68061e3 fast-import: disallow "feature export-marks" by default The fast-import stream command "feature export-marks=<path>" lets the stream write marks to an arbitrary path. This may be surprising if you are running fast-import against an untrusted input (which otherwise cannot do anything except update Git objects and refs). Let's disallow the use of this feature by default, and provide a command-line option to re-enable it (you can always just use the command-line --export-marks as well, but the in-stream version provides an easy way for exporters to control the process). This is a backwards-incompatible change, since the default is flipping to the new, safer behavior. However, since the main users of the in-stream versions would be import/export-based remote helpers, and since we trust remote helpers already (which are already running arbitrary code), we'll pass the new option by default when reading a remote helper's stream. This should minimize the impact. Note that the implementation isn't totally simple, as we have to work around the fact that fast-import doesn't parse its command-line options until after it has read any "feature" lines from the stream. This is how it lets command-line options override in-stream. But in our case, it's important to parse the new --allow-unsafe-features first. There are three options for resolving this: 1. Do a separate "early" pass over the options. This is easy for us to do because there are no command-line options that allow the "unstuck" form (so there's no chance of us mistaking an argument for an option), though it does introduce a risk of incorrect parsing later (e.g,. if we convert to parse-options). 2. Move the option parsing phase back to the start of the program, but teach the stream-reading code never to override an existing value. This is tricky, because stream "feature" lines override each other (meaning we'd have to start tracking the source for every option). 3. Accept that we might parse a "feature export-marks" line that is forbidden, as long we don't _act_ on it until after we've parsed the command line options. This would, in fact, work with the current code, but only because the previous patch fixed the export-marks parser to avoid touching the filesystem. So while it works, it does carry risk of somebody getting it wrong in the future in a rather subtle and unsafe way. I've gone with option (1) here as simple, safe, and unlikely to cause regressions. This fixes CVE-2019-1348. Signed-off-by: Jeff King <peff@peff.net> 04 December 2019, 12:20:04 UTC
0196830 fast-import: delay creating leading directories for export-marks When we parse the --export-marks option, we don't immediately open the file, but we do create any leading directories. This can be especially confusing when a command-line option overrides an in-stream one, in which case we'd create the leading directory for the in-stream file, even though we never actually write the file. Let's instead create the directories just before opening the file, which means we'll create only useful directories. Note that this could change the handling of relative paths if we chdir() in between, but we don't actually do so; the only permanent chdir is from setup_git_directory() which runs before either code path (potentially we should take the pre-setup dir into account to avoid surprising the user, but that's an orthogonal change). The test just adapts the existing "override" test to use paths with leading directories. This checks both that the correct directory is created (which worked before but was not tested), and that the overridden one is not (our new fix here). While we're here, let's also check the error result of safe_create_leading_directories(). We'd presumably notice any failure immediately after when we try to open the file itself, but we can give a more specific error message in this case. Signed-off-by: Jeff King <peff@peff.net> 04 December 2019, 12:20:04 UTC
e075dba fast-import: stop creating leading directories for import-marks When asked to import marks from "subdir/file.marks", we create the leading directory "subdir" if it doesn't exist. This makes no sense for importing marks, where we only ever open the path for reading. Most of the time this would be a noop, since if the marks file exists, then the leading directories exist, too. But if it doesn't (e.g., because --import-marks-if-exists was used), then we'd create the useless directory. This dates back to 580d5f83e7 (fast-import: always create marks_file directories, 2010-03-29). Even then it was useless, so it seems to have been added in error alongside the --export-marks case (which _is_ helpful). Signed-off-by: Jeff King <peff@peff.net> 04 December 2019, 12:20:04 UTC
11e934d fast-import: tighten parsing of boolean command line options We parse options like "--max-pack-size=" using skip_prefix(), which makes sense to get at the bytes after the "=". However, we also parse "--quiet" and "--stats" with skip_prefix(), which allows things like "--quiet-nonsense" to behave like "--quiet". This was a mistaken conversion in 0f6927c229 (fast-import: put option parsing code in separate functions, 2009-12-04). Let's tighten this to an exact match, which was the original intent. Signed-off-by: Jeff King <peff@peff.net> 04 December 2019, 12:20:04 UTC
816f806 t9300: create marks files for double-import-marks test Our tests confirm that providing two "import-marks" options in a fast-import stream is an error. However, the invoked command would fail even without covering this case, because the marks files themselves do not actually exist. Let's create the files to make sure we fail for the right reason (we actually do, because the option parsing happens before we open anything, but this future-proofs our test). Signed-off-by: Jeff King <peff@peff.net> 04 December 2019, 12:20:03 UTC
f94804c t9300: drop some useless uses of cat These waste a process, and make the line longer than it needs to be. Signed-off-by: Jeff King <peff@peff.net> 04 December 2019, 12:20:03 UTC
6e9e91e Git 2.17.2 Signed-off-by: Junio C Hamano <gitster@pobox.com> 27 September 2018, 18:44:07 UTC
1a7fd1f fsck: detect submodule paths starting with dash As with urls, submodule paths with dashes are ignored by git, but may end up confusing older versions. Detecting them via fsck lets us prevent modern versions of git from being a vector to spread broken .gitmodules to older versions. Compared to blocking leading-dash urls, though, this detection may be less of a good idea: 1. While such paths provide confusing and broken results, they don't seem to actually work as option injections against anything except "cd". In particular, the submodule code seems to canonicalize to an absolute path before running "git clone" (so it passes /your/clone/-sub). 2. It's more likely that we may one day make such names actually work correctly. Even after we revert this fsck check, it will continue to be a hassle until hosting servers are all updated. On the other hand, it's not entirely clear that the behavior in older versions is safe. And if we do want to eventually allow this, we may end up doing so with a special syntax anyway (e.g., writing "./-sub" in the .gitmodules file, and teaching the submodule code to canonicalize it when comparing). So on balance, this is probably a good protection. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 27 September 2018, 18:41:31 UTC
a124133 fsck: detect submodule urls starting with dash Urls with leading dashes can cause mischief on older versions of Git. We should detect them so that they can be rejected by receive.fsckObjects, preventing modern versions of git from being a vector by which attacks can spread. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 27 September 2018, 18:41:26 UTC
e43aab7 Sync with 2.16.5 * maint-2.16: Git 2.16.5 Git 2.15.3 Git 2.14.5 submodule-config: ban submodule paths that start with a dash submodule-config: ban submodule urls that start with dash submodule--helper: use "--" to signal end of clone options 27 September 2018, 18:41:02 UTC
27d05d1 Git 2.16.5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 27 September 2018, 18:38:32 UTC
424aac6 Sync with 2.15.3 * maint-2.15: Git 2.15.3 Git 2.14.5 submodule-config: ban submodule paths that start with a dash submodule-config: ban submodule urls that start with dash submodule--helper: use "--" to signal end of clone options 27 September 2018, 18:35:43 UTC
924c623 Git 2.15.3 Signed-off-by: Junio C Hamano <gitster@pobox.com> 27 September 2018, 18:33:47 UTC
902df9f Sync with Git 2.14.4 * maint-2.14: Git 2.14.5 submodule-config: ban submodule paths that start with a dash submodule-config: ban submodule urls that start with dash submodule--helper: use "--" to signal end of clone options 27 September 2018, 18:20:22 UTC
d0832b2 Git 2.14.5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 27 September 2018, 18:19:11 UTC
273c614 submodule-config: ban submodule paths that start with a dash We recently banned submodule urls that look like command-line options. This is the matching change to ban leading-dash paths. As with the urls, this should not break any use cases that currently work. Even with our "--" separator passed to git-clone, git-submodule.sh gets confused. Without the code portion of this patch, the clone of "-sub" added in t7417 would yield results like: /path/to/git-submodule: 410: cd: Illegal option -s /path/to/git-submodule: 417: cd: Illegal option -s /path/to/git-submodule: 410: cd: Illegal option -s /path/to/git-submodule: 417: cd: Illegal option -s Fetched in submodule path '-sub', but it did not contain b56243f8f4eb91b2f1f8109452e659f14dd3fbe4. Direct fetching of that commit failed. Moreover, naively adding such a submodule doesn't work: $ git submodule add $url -sub The following path is ignored by one of your .gitignore files: -sub even though there is no such ignore pattern (the test script hacks around this with a well-placed "git mv"). Unlike leading-dash urls, though, it's possible that such a path _could_ be useful if we eventually made it work. So this commit should be seen not as recommending a particular policy, but rather temporarily closing off a broken and possibly dangerous code-path. We may revisit this decision later. There are two minor differences to the tests in t7416 (that covered urls): 1. We don't have a "./-sub" escape hatch to make this work, since the submodule code expects to be able to match canonical index names to the path field (so you are free to add submodule config with that path, but we would never actually use it, since an index entry would never start with "./"). 2. After this patch, cloning actually succeeds. Since we ignore the submodule.*.path value, we fail to find a config stanza for our submodule at all, and simply treat it as inactive. We still check for the "ignoring" message. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 27 September 2018, 16:34:59 UTC
f6adec4 submodule-config: ban submodule urls that start with dash The previous commit taught the submodule code to invoke our "git clone $url $path" with a "--" separator so that we aren't confused by urls or paths that start with dashes. However, that's just one code path. It's not clear if there are others, and it would be an easy mistake to add one in the future. Moreover, even with the fix in the previous commit, it's quite hard to actually do anything useful with such an entry. Any url starting with a dash must fall into one of three categories: - it's meant as a file url, like "-path". But then any clone is not going to have the matching path, since it's by definition relative inside the newly created clone. If you spell it as "./-path", the submodule code sees the "/" and translates this to an absolute path, so it at least works (assuming the receiver has the same filesystem layout as you). But that trick does not apply for a bare "-path". - it's meant as an ssh url, like "-host:path". But this already doesn't work, as we explicitly disallow ssh hostnames that begin with a dash (to avoid option injection against ssh). - it's a remote-helper scheme, like "-scheme::data". This _could_ work if the receiver bends over backwards and creates a funny-named helper like "git-remote--scheme". But normally there would not be any helper that matches. Since such a url does not work today and is not likely to do anything useful in the future, let's simply disallow them entirely. That protects the existing "git clone" path (in a belt-and-suspenders way), along with any others that might exist. Our tests cover two cases: 1. A file url with "./" continues to work, showing that there's an escape hatch for people with truly silly repo names. 2. A url starting with "-" is rejected. Note that we expect case (2) to fail, but it would have done so even without this commit, for the reasons given above. So instead of just expecting failure, let's also check for the magic word "ignoring" on stderr. That lets us know that we failed for the right reason. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 27 September 2018, 16:34:58 UTC
98afac7 submodule--helper: use "--" to signal end of clone options When we clone a submodule, we call "git clone $url $path". But there's nothing to say that those components can't begin with a dash themselves, confusing git-clone into thinking they're options. Let's pass "--" to make it clear what we expect. There's no test here, because it's actually quite hard to make these names work, even with "git clone" parsing them correctly. And we're going to restrict these cases even further in future commits. So we'll leave off testing until then; this is just the minimal fix to prevent us from doing something stupid with a badly formed entry. Reported-by: joernchen <joernchen@phenoelit.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 27 September 2018, 16:34:55 UTC
fc54c1a Git 2.17.1 Signed-off-by: Junio C Hamano <gitster@pobox.com> 22 May 2018, 05:28:26 UTC
9e84a6d Merge branch 'jk/submodule-fsck-loose' into maint * jk/submodule-fsck-loose: fsck: complain when .gitmodules is a symlink index-pack: check .gitmodules files with --strict unpack-objects: call fsck_finish() after fscking objects fsck: call fsck_finish() after fscking objects fsck: check .gitmodules content fsck: handle promisor objects in .gitmodules check fsck: detect gitmodules files fsck: actually fsck blob data fsck: simplify ".git" check index-pack: make fsck error message more specific 22 May 2018, 05:26:05 UTC
68f95b2 Sync with Git 2.16.4 * maint-2.16: Git 2.16.4 Git 2.15.2 Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths 22 May 2018, 05:25:26 UTC
a42a58d Git 2.16.4 Signed-off-by: Junio C Hamano <gitster@pobox.com> 22 May 2018, 05:18:51 UTC
0230204 Sync with Git 2.15.2 * maint-2.15: Git 2.15.2 Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths 22 May 2018, 05:18:06 UTC
d33c875 Git 2.15.2 Signed-off-by: Junio C Hamano <gitster@pobox.com> 22 May 2018, 05:15:59 UTC
9e0f06d Sync with Git 2.14.4 * maint-2.14: Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths 22 May 2018, 05:15:14 UTC
4dde7b8 Git 2.14.4 Signed-off-by: Junio C Hamano <gitster@pobox.com> 22 May 2018, 05:12:02 UTC
7b01c71 Sync with Git 2.13.7 * maint-2.13: Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths 22 May 2018, 05:10:49 UTC
0114f71 Git 2.13.7 Signed-off-by: Junio C Hamano <gitster@pobox.com> 22 May 2018, 04:50:36 UTC
8528c31 Merge branch 'jk/submodule-fix-loose' into maint-2.13 * jk/submodule-fix-loose: verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths 22 May 2018, 04:48:26 UTC
b7b1fca fsck: complain when .gitmodules is a symlink We've recently forbidden .gitmodules to be a symlink in verify_path(). And it's an easy way to circumvent our fsck checks for .gitmodules content. So let's complain when we see it. Signed-off-by: Jeff King <peff@peff.net> 22 May 2018, 03:55:12 UTC
73c3f0f index-pack: check .gitmodules files with --strict Now that the internal fsck code has all of the plumbing we need, we can start checking incoming .gitmodules files. Naively, it seems like we would just need to add a call to fsck_finish() after we've processed all of the objects. And that would be enough to cover the initial test included here. But there are two extra bits: 1. We currently don't bother calling fsck_object() at all for blobs, since it has traditionally been a noop. We'd actually catch these blobs in fsck_finish() at the end, but it's more efficient to check them when we already have the object loaded in memory. 2. The second pass done by fsck_finish() needs to access the objects, but we're actually indexing the pack in this process. In theory we could give the fsck code a special callback for accessing the in-pack data, but it's actually quite tricky: a. We don't have an internal efficient index mapping oids to packfile offsets. We only generate it on the fly as part of writing out the .idx file. b. We'd still have to reconstruct deltas, which means we'd basically have to replicate all of the reading logic in packfile.c. Instead, let's avoid running fsck_finish() until after we've written out the .idx file, and then just add it to our internal packed_git list. This does mean that the objects are "in the repository" before we finish our fsck checks. But unpack-objects already exhibits this same behavior, and it's an acceptable tradeoff here for the same reason: the quarantine mechanism means that pushes will be fully protected. In addition to a basic push test in t7415, we add a sneaky pack that reverses the usual object order in the pack, requiring that index-pack access the tree and blob during the "finish" step. This already works for unpack-objects (since it will have written out loose objects), but we'll check it with this sneaky pack for good measure. Signed-off-by: Jeff King <peff@peff.net> 22 May 2018, 03:55:12 UTC
6e328d6 unpack-objects: call fsck_finish() after fscking objects As with the previous commit, we must call fsck's "finish" function in order to catch any queued objects for .gitmodules checks. This second pass will be able to access any incoming objects, because we will have exploded them to loose objects by now. This isn't quite ideal, because it means that bad objects may have been written to the object database (and a subsequent operation could then reference them, even if the other side doesn't send the objects again). However, this is sufficient when used with receive.fsckObjects, since those loose objects will all be placed in a temporary quarantine area that will get wiped if we find any problems. Signed-off-by: Jeff King <peff@peff.net> 22 May 2018, 03:55:12 UTC
1995b5e fsck: call fsck_finish() after fscking objects Now that the internal fsck code is capable of checking .gitmodules files, we just need to teach its callers to use the "finish" function to check any queued objects. With this, we can now catch the malicious case in t7415 with git-fsck. Signed-off-by: Jeff King <peff@peff.net> 22 May 2018, 03:55:12 UTC
ed8b10f fsck: check .gitmodules content This patch detects and blocks submodule names which do not match the policy set forth in submodule-config. These should already be caught by the submodule code itself, but putting the check here means that newer versions of Git can protect older ones from malicious entries (e.g., a server with receive.fsckObjects will block the objects, protecting clients which fetch from it). As a side effect, this means fsck will also complain about .gitmodules files that cannot be parsed (or were larger than core.bigFileThreshold). Signed-off-by: Jeff King <peff@peff.net> 22 May 2018, 03:55:12 UTC
2738744 fsck: handle promisor objects in .gitmodules check If we have a tree that points to a .gitmodules blob but don't have that blob, we can't check its contents. This produces an fsck error when we encounter it. But in the case of a promisor object, this absence is expected, and we must not complain. Note that this can technically circumvent our transfer.fsckObjects check. Imagine a client fetches a tree, but not the matching .gitmodules blob. An fsck of the incoming objects will show that we don't have enough information. Later, we do fetch the actual blob. But we have no idea that it's a .gitmodules file. The only ways to get around this would be to re-scan all of the existing trees whenever new ones enter (which is expensive), or to somehow persist the gitmodules_found set between fsck runs (which is complicated). In practice, it's probably OK to ignore the problem. Any repository which has all of the objects (including the one serving the promisor packs) can perform the checks. Since promisor packs are inherently about a hierarchical topology in which clients rely on upstream repositories, those upstream repositories can protect all of their downstream clients from broken objects. Signed-off-by: Jeff King <peff@peff.net> 22 May 2018, 03:55:12 UTC
159e7b0 fsck: detect gitmodules files In preparation for performing fsck checks on .gitmodules files, this commit plumbs in the actual detection of the files. Note that unlike most other fsck checks, this cannot be a property of a single object: we must know that the object is found at a ".gitmodules" path at the root tree of a commit. Since the fsck code only sees one object at a time, we have to mark the related objects to fit the puzzle together. When we see a commit we mark its tree as a root tree, and when we see a root tree with a .gitmodules file, we mark the corresponding blob to be checked. In an ideal world, we'd check the objects in topological order: commits followed by trees followed by blobs. In that case we can avoid ever loading an object twice, since all markings would be complete by the time we get to the marked objects. And indeed, if we are checking a single packfile, this is the order in which Git will generally write the objects. But we can't count on that: 1. git-fsck may show us the objects in arbitrary order (loose objects are fed in sha1 order, but we may also have multiple packs, and we process each pack fully in sequence). 2. The type ordering is just what git-pack-objects happens to write now. The pack format does not require a specific order, and it's possible that future versions of Git (or a custom version trying to fool official Git's fsck checks!) may order it differently. 3. We may not even be fscking all of the relevant objects at once. Consider pushing with transfer.fsckObjects, where one push adds a blob at path "foo", and then a second push adds the same blob at path ".gitmodules". The blob is not part of the second push at all, but we need to mark and check it. So in the general case, we need to make up to three passes over the objects: once to make sure we've seen all commits, then once to cover any trees we might have missed, and then a final pass to cover any .gitmodules blobs we found in the second pass. We can simplify things a bit by loosening the requirement that we find .gitmodules only at root trees. Technically a file like "subdir/.gitmodules" is not parsed by Git, but it's not unreasonable for us to declare that Git is aware of all ".gitmodules" files and make them eligible for checking. That lets us drop the root-tree requirement, which eliminates one pass entirely. And it makes our worst case much better: instead of potentially queueing every root tree to be re-examined, the worst case is that we queue each unique .gitmodules blob for a second look. This patch just adds the boilerplate to find .gitmodules files. The actual content checks will come in a subsequent commit. Signed-off-by: Jeff King <peff@peff.net> 22 May 2018, 03:55:12 UTC
7ac4f3a fsck: actually fsck blob data Because fscking a blob has always been a noop, we didn't bother passing around the blob data. In preparation for content-level checks, let's fix up a few things: 1. The fsck_object() function just returns success for any blob. Let's a noop fsck_blob(), which we can fill in with actual logic later. 2. The fsck_loose() function in builtin/fsck.c just threw away blob content after loading it. Let's hold onto it until after we've called fsck_object(). The easiest way to do this is to just drop the parse_loose_object() helper entirely. Incidentally, this also fixes a memory leak: if we successfully loaded the object data but did not parse it, we would have left the function without freeing it. 3. When fsck_loose() loads the object data, it does so with a custom read_loose_object() helper. This function streams any blobs, regardless of size, under the assumption that we're only checking the sha1. Instead, let's actually load blobs smaller than big_file_threshold, as the normal object-reading code-paths would do. This lets us fsck small files, and a NULL return is an indication that the blob was so big that it needed to be streamed, and we can pass that information along to fsck_blob(). Signed-off-by: Jeff King <peff@peff.net> 22 May 2018, 03:55:12 UTC
ed9c322 fsck: simplify ".git" check There's no need for us to manually check for ".git"; it's a subset of the other filesystem-specific tests. Dropping it makes our code slightly shorter. More importantly, the existing code may make a reader wonder why ".GIT" is not covered here, and whether that is a bug (it isn't, as it's also covered in the filesystem-specific tests). Signed-off-by: Jeff King <peff@peff.net> 22 May 2018, 03:55:12 UTC
db5a58c index-pack: make fsck error message more specific If fsck reports an error, we say only "Error in object". This isn't quite as bad as it might seem, since the fsck code would have dumped some errors to stderr already. But it might help to give a little more context. The earlier output would not have even mentioned "fsck", and that may be a clue that the "fsck.*" or "*.fsckObjects" config may be relevant. Signed-off-by: Jeff King <peff@peff.net> 22 May 2018, 03:55:12 UTC
eedd594 Merge branch 'jk/submodule-name-verify-fix' into jk/submodule-name-verify-fsck * jk/submodule-name-verify-fix: verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_path: drop clever fallthrough skip_prefix: add icase-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests path: match NTFS short names for more .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths Note that this includes two bits of evil-merge: - there's a new call to verify_path() that doesn't actually have a mode available. It should be OK to pass "0" here, since we're just manipulating the untracked cache, not an actual index entry. - the lstat() in builtin/update-index.c:update_one() needs to be updated to handle the fsmonitor case (without this it still behaves correctly, but does an unnecessary lstat). 22 May 2018, 03:54:28 UTC
10ecfa7 verify_path: disallow symlinks in .gitmodules There are a few reasons it's not a good idea to make .gitmodules a symlink, including: 1. It won't be portable to systems without symlinks. 2. It may behave inconsistently, since Git may look at this file in the index or a tree without bothering to resolve any symbolic links. We don't do this _yet_, but the config infrastructure is there and it's planned for the future. With some clever code, we could make (2) work. And some people may not care about (1) if they only work on one platform. But there are a few security reasons to simply disallow it: a. A symlinked .gitmodules file may circumvent any fsck checks of the content. b. Git may read and write from the on-disk file without sanity checking the symlink target. So for example, if you link ".gitmodules" to "../oops" and run "git submodule add", we'll write to the file "oops" outside the repository. Again, both of those are problems that _could_ be solved with sufficient code, but given the complications in (1) and (2), we're better off just outlawing it explicitly. Note the slightly tricky call to verify_path() in update-index's update_one(). There we may not have a mode if we're not updating from the filesystem (e.g., we might just be removing the file). Passing "0" as the mode there works fine; since it's not a symlink, we'll just skip the extra checks. Signed-off-by: Jeff King <peff@peff.net> 22 May 2018, 03:50:11 UTC
back to top