Configured a .devcontainer, Dockerfile, settings.json, launch.json, and github actions for a personal project.
Pasted in tree of directory structure to explore high-level organization options.
Converted 20 pages of handwritten research notes into a summarized README.md.
Asked for libraries which might be helpful, and compared tradeoffs.
Created tikZ/pgf illustrations, Cmd+K to add details, then Cmd+k again to simplify.
Fed in 80 paragraphs of notes, asked for suggested tests to take them into consideration, wrote them out. (Model not very good at pytest fixtures?)
Some test failures could not be auto-fixed, paste in stack trace, draft request for help to collaborator.
Lift and shift is much easier, so started a migration from torch to pytorch-lightning.
Replaced sections of manually written code with calls to library functions.
Asked for quality issues to address, then worked through the checklist it gave me.
Deduplicated redundant code in scripts, organizing logic into modules.
Went outside of my codebase and used terminal command suggestions to categorize + clean up files in my downloads folder, freeing 1/3rd of used disk space.
Queried to check my current progress vs what was still remaining in the plan.
Checked for inconsistencies when making changes across multiple files.
Made changes in FileA, then Cmd+K in FileB “Change to be compatible with @FileA”
Asked if FileC would need modifications or not, and if so, what it would involve.
Pasted in sample external code and Cmd+k to integrate it with my project.
At work, took a stale PR, set up new branch, Cmd+k old comments, checked it in.
Asked a principal engineer for a quick call, shared screen, recorded convo for 4.5 min, pasted transcript to chat, list changes discussed, implement them.
Wanted to learn memory management, had it generate puzzles for me to solve, told it to calibrate difficulty of next round of problems based on score my answers get.
A lot of prompt engineering to convince model to ask decent questions, found napkin math helps more than adding quotes from high quality textbooks/blog posts.
Being given all the steps at once is overwhelming. Consider option for inner monologue, only showing next step, but can expand entire plan at any time.
Consider borrowing elements from software designed for people with mental disabilities, e.g https://www.mycoughdrop.com/. This reduces cognitive burden but keeps a human in the loop instead of delegating the task to an agent.
One example might be, when chat interface creates a bullet point, proactively generate code, add an icon, and put it as an action on a board.
Software in the real world often grows organically instead of following a strict design. The lens I take of modern source control, dependency analysis, trace debugging, etc. is a form of genetic engineering where you splice in new traits, generate mutants, select against a fitness function. Consider a “Smart Paste” feature for merging, or a “Disentangle” feature to do the opposite, this could be implemented with a concept slider to select the relative strength of each parent.
I would like to see more emphasis on understanding and interpretability. Anysphere is set to become commercially successful, shortens AGI timelines by a bit compared to the counterfactual, but not by much given Copilot X… I want to understand Sualeh and Aman’s view on integrating alignment research into their products and mitigating x-risks.