Simon Willison ¶
- https://simonwillison.net/2020/Feb/11/cheating-at-unit-tests-pytest-black/
- https://simonwillison.net/2018/Jul/28/documentation-unit-tests/
- https://datasette.readthedocs.io/en/latest/ecosystem.html#ecosystem
Simon Willison GitHub README ¶
Building a self-updating profile README for GitHub ¶
GitHub quietly released a new feature at some point in the past few days: profile READMEs.
Create a repository with the same name as your GitHub account (in my case that’s github.com/simonw/simonw), add a README.md to it and GitHub will render the contents at the top of your personal profile page. For me that’s github.com/simonw.
I couldn’t resist re-using the trick from this blog post and implementing a GitHub Action to automatically keep my profile README up-to-date.
Visit github.com/simonw and you’ll see a three-column README showing my latest GitHub project releases, my latest blog entries and my latest TILs.
I’m doing this with a GitHub Action in build.yml. It’s configured to run on every push to the repo, on a schedule at 32 minutes past the hour, and on the new workflow_dispatch event, which means I get a manual button I can click to trigger it on demand.
The Action runs a Python script called build_readme.py which does the following:
- Hits the GitHub GraphQL API to retrieve the latest release for every one of my 300+ repositories
- Hits my blog’s full entries Atom feed to retrieve the most recent posts (using the feedparser Python library)
- Hits my TILs website’s Datasette API running this SQL query to return the latest TIL links
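The first step depends on cursor-based pagination, because the GraphQL query asks for repositories 100 at a time. A minimal sketch of that loop follows. It assumes a hypothetical `fetch_page` callable standing in for the real API call, and the fake pages are illustrative data, not real API output:

```python
def collect_all(fetch_page):
    """Follow hasNextPage/endCursor until every page has been read.

    fetch_page is a hypothetical callable taking a cursor (or None) and
    returning a dict shaped like {"nodes": [...], "pageInfo": {...}}.
    """
    items = []
    cursor = None
    has_next_page = True
    while has_next_page:
        page = fetch_page(cursor)
        items.extend(page["nodes"])
        has_next_page = page["pageInfo"]["hasNextPage"]
        cursor = page["pageInfo"]["endCursor"]
    return items


# Illustrative stand-in for the API: two pages keyed by cursor.
FAKE_PAGES = {
    None: {
        "nodes": ["repo-a", "repo-b"],
        "pageInfo": {"hasNextPage": True, "endCursor": "cursor-1"},
    },
    "cursor-1": {
        "nodes": ["repo-c"],
        "pageInfo": {"hasNextPage": False, "endCursor": None},
    },
}


def fake_fetch_page(cursor):
    return FAKE_PAGES[cursor]


if __name__ == "__main__":
    print(collect_all(fake_fetch_page))
```

Passing the page fetcher in as an argument keeps the loop testable without a network connection or a token.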
It then turns the results from those various sources into a markdown list of links and replaces commented blocks in the README that look like this:
<!-- recent_releases starts -->
...
<!-- recent_releases ends -->
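The marker replacement can be tried on its own. The sketch below is a variation on the idea, not the script's exact code: the helper name `replace_between_markers` is made up for illustration, and it passes a function to `re.sub` so that backslashes in the new content are not treated as escape sequences.

```python
import re


def replace_between_markers(content, marker, chunk):
    """Swap whatever sits between the two marker comments for chunk."""
    pattern = re.compile(
        r"<!-- {0} starts -->.*<!-- {0} ends -->".format(re.escape(marker)),
        re.DOTALL,
    )
    replacement = "<!-- {0} starts -->\n{1}\n<!-- {0} ends -->".format(marker, chunk)
    # A function replacement is inserted literally, so backslashes in chunk
    # are not interpreted as escapes or group references.
    return pattern.sub(lambda match: replacement, content)


readme = "Intro\n<!-- recent_releases starts -->\nold\n<!-- recent_releases ends -->\nOutro"
updated = replace_between_markers(readme, "recent_releases", "* new item")
print(updated)
```

Because the markers are written back along with the new content, the same README can be rewritten again on the next run.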
The whole script is less than 150 lines of Python.
simonw/blob/master/build_readme.py ¶
from python_graphql_client import GraphqlClient
import feedparser
import httpx
import json
import pathlib
import re
import os
root = pathlib.Path(__file__).parent.resolve()
client = GraphqlClient(endpoint="https://api.github.com/graphql")
TOKEN = os.environ.get("SIMONW_TOKEN", "")
def replace_chunk(content, marker, chunk):
    r = re.compile(
        r"<!\-\- {} starts \-\->.*<!\-\- {} ends \-\->".format(marker, marker),
        re.DOTALL,
    )
    chunk = "<!-- {} starts -->\n{}\n<!-- {} ends -->".format(marker, chunk, marker)
    return r.sub(chunk, content)
def make_query(after_cursor=None):
    return """
query {
  viewer {
    repositories(first: 100, privacy: PUBLIC, after:AFTER) {
      pageInfo {
        hasNextPage
        endCursor
      }
      nodes {
        name
        releases(last:1) {
          totalCount
          nodes {
            name
            publishedAt
            url
          }
        }
      }
    }
  }
}
""".replace(
        "AFTER", '"{}"'.format(after_cursor) if after_cursor else "null"
    )
def fetch_releases(oauth_token):
    repos = []
    releases = []
    repo_names = set()
    has_next_page = True
    after_cursor = None
    while has_next_page:
        data = client.execute(
            query=make_query(after_cursor),
            headers={"Authorization": "Bearer {}".format(oauth_token)},
        )
        print()
        print(json.dumps(data, indent=4))
        print()
        for repo in data["data"]["viewer"]["repositories"]["nodes"]:
            if repo["releases"]["totalCount"] and repo["name"] not in repo_names:
                repos.append(repo)
                repo_names.add(repo["name"])
                releases.append(
                    {
                        "repo": repo["name"],
                        "release": repo["releases"]["nodes"][0]["name"]
                        .replace(repo["name"], "")
                        .strip(),
                        "published_at": repo["releases"]["nodes"][0][
                            "publishedAt"
                        ].split("T")[0],
                        "url": repo["releases"]["nodes"][0]["url"],
                    }
                )
        has_next_page = data["data"]["viewer"]["repositories"]["pageInfo"][
            "hasNextPage"
        ]
        after_cursor = data["data"]["viewer"]["repositories"]["pageInfo"]["endCursor"]
    return releases
def fetch_tils():
    sql = "select title, url, created_utc from til order by created_utc desc limit 5"
    return httpx.get(
        "https://til.simonwillison.net/til.json",
        params={"sql": sql, "_shape": "array",},
    ).json()
def fetch_blog_entries():
    entries = feedparser.parse("https://simonwillison.net/atom/entries/")["entries"]
    return [
        {
            "title": entry["title"],
            "url": entry["link"].split("#")[0],
            "published": entry["published"].split("T")[0],
        }
        for entry in entries
    ]
if __name__ == "__main__":
    readme = root / "README.md"
    releases = fetch_releases(TOKEN)
    releases.sort(key=lambda r: r["published_at"], reverse=True)
    md = "\n".join(
        [
            "* [{repo} {release}]({url}) - {published_at}".format(**release)
            for release in releases[:5]
        ]
    )
    readme_contents = readme.open().read()
    rewritten = replace_chunk(readme_contents, "recent_releases", md)
    tils = fetch_tils()
    tils_md = "\n".join(
        [
            "* [{title}]({url}) - {created_at}".format(
                title=til["title"],
                url=til["url"],
                created_at=til["created_utc"].split("T")[0],
            )
            for til in tils
        ]
    )
    rewritten = replace_chunk(rewritten, "tils", tils_md)
    entries = fetch_blog_entries()[:5]
    entries_md = "\n".join(
        ["* [{title}]({url}) - {published}".format(**entry) for entry in entries]
    )
    rewritten = replace_chunk(rewritten, "blog", entries_md)
    readme.open("w").write(rewritten)
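The rendering step in the script (dicts in, markdown bullet list out) is easy to exercise in isolation. This is a rough sketch only: `render_links` is a hypothetical helper that does not appear in the script, and the sample records are invented placeholders shaped like the TIL rows.

```python
def render_links(items, date_key):
    """Render dicts with title/url plus a timestamp as a markdown bullet list."""
    return "\n".join(
        "* [{title}]({url}) - {date}".format(
            title=item["title"],
            url=item["url"],
            # Keep only the date part of an ISO 8601 timestamp.
            date=item[date_key].split("T")[0],
        )
        for item in items
    )


# Invented sample data, shaped like rows with title, url and created_utc.
sample = [
    {
        "title": "First post",
        "url": "https://example.com/1",
        "created_utc": "2020-07-10T12:00:00+00:00",
    },
    {
        "title": "Second post",
        "url": "https://example.com/2",
        "created_utc": "2020-07-09T08:30:00+00:00",
    },
]
print(render_links(sample, "created_utc"))
```

Splitting on "T" works here because the timestamps are ISO 8601 strings. A source with a different date format would need real date parsing instead.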
Things I’ve learned (TIL) ¶
See also