A significant amount of source code has already been ingested into the Software Heritage archive; see the Archive Changelog for notable changes to the archive over time.
It currently includes the following software origins.

Regular crawling

These software origins get continuously discovered and archived using the listers implemented by Software Heritage.

instance                   type    count
bitbucket                  git     593,906

bower                      git     1,198

eclipse                    git     34
git-kernel                 git     1,042

cran                       cran    25,875

dlang                      git     11

Debian                     deb     35,037
Ubuntu-Security            deb     2,701

try-gitea                  git     4

github                     git     1,731,595

inria                      git     2,502

gogs.univ-littoral.fr      git     12

forge.extranet.logilab.fr  hg      353
heptapod.host              hg      383

clojars.org                hg      1
repo1.maven.org            git     219
repo1.maven.org            hg      21
repo1.maven.org            maven   284,454
repository.jboss.org       maven   1,745

coq.inria.fr               opam    432
github.com                 opam    39
opam.ocaml.org             opam    4,434

packagist                  git     58,182
packagist                  hg      11
packagist                  svn     50

swh                        git     160

pubdev                     pubdev  45,717

CentOS                     rpm     5,035
Fedora                     rpm     16,257

main                       bzr     8
main                       git     12,232
main                       hg      25,059

Discontinued hosting

These origins come from hosting services that have been discontinued. They have been archived by Software Heritage.

instance     type
gitorious    git

googlecode   git
googlecode   hg
googlecode   svn

bitbucket    hg

On demand archival

These origins are directly pushed into the archive by trusted partners using the deposit service of Software Heritage.

instance  type
elife     deposit

hal       deposit

ipol      deposit