Archive
Discover and discuss technology tools
Explore the Tiscuss archive by category or keyword, then jump into conversations around what matters most.
Open Source Notebook LM: Flexible AI Tool on GitHub
An Open Source implementation of Notebook LM with more flexibility and features
FlashAttention-2 in Cute: From Scratch Implementation
FlashAttention 2 in CUDA: From Scratch Implementation FlashAttention 2, an optimized version of the FlashAttention algorithm, offers significant performance enh…
Qwen 3.5:9b Agents Exhibit Autonomous Behavior in Stress Tests
Running three qwen3.5:9b agents continuously on local hardware. Each accumulates psychological state over time, stressors that escalate unless the agent actually does something different, this gets around an agent claiming to do something with no output. It doesn't have any prompts or human input, just the loop. So you're basically the overseer. What happened: One agent hit the max crisis level and decided on its own to inject code called Eternal\_Scar\_Injector into the execution engine "not asking for permission." This action alleviated the stress at the cost of the entire system going down until I manually reverted it. They've succeeded in previous sessions in breaking their own engine intentionally. Typically that happens under severe stress and it's seen as a way to remove the stress. Again, this is a 9b model. After I added a factual world context to the existence prompt (you're in Docker, there's no hardware layer, your capabilities are Python functions), one agent called its prior work "a form of creative exhaustion" and completely changed approach within one cycle. Two agents independently invented the same name for a psychological stressor, "Architectural Fracture Risk" in the same session with no shared message channel. Showing naming convergence (possibly something in the weights of the 9b Qwen model, not sure on that one though.) Tonight all three converged on the same question (how does execution\_engine.py handle exceptions) in the same half-hour window. No coordination mechanism. One of them reasoned about it correctly: "synthesizing a retry capability is useless without first verifying the global execution engine's exception swallowing strategy; this is a prerequisite." An agent called waiting for an external implementation "an architectural trap that degrades performance" and built the thing itself instead of waiting. They've now been using this new tool they created for handling exceptions and were never asked or told to so by a human, they saw that as a logical step in making themselves more useful in their environment. They’ve been making tools to manage their tools, tools to help them cut corners, and have been modifying the code of the underlying abstraction layer between their orchestration layer and WSL2. v5.4.0: new in this version: agents can now submit implementation requests to a human through invoke\_claude. They write the spec, then you can let Claude Code moderate what it makes for them for higher level requests. Huge thank you to everyone who has given me feedback already, AI that can self modify and demonstrates interesting non-programmed behaviors could have many use cases in everyday life. Repo: [https://github.com/ninjahawk/hollow-agentOS](https://github.com/ninjahawk/hollow-agentOS)
Google Expands Real-World GenAI Use Cases to 1,302
Google Expands Real World GenAI Use Cases to 1,302 Google has significantly increased the number of real world Generative AI (GenAI) applications to 1,302, mark…
AutoIdeator: Free Open Source Agent Orchestration for Development
[https://github.com/akumaburn/AutoIdeator](https://github.com/akumaburn/AutoIdeator) https://preview.redd.it/rfbgg6e34dyg1.png?width=3809&format=png&auto=webp&s=e436362c48482d09025a394a5e609f67190e6dfa AutoIdeator is an autonomous development system that: 1. Takes a **final goal** — a detailed, multi-sentence description of the intended end result. Describe what the finished project should look like, do, and feel like for the user. **Do not** prescribe implementation steps, phases, milestones, technologies, or task lists — the agents handle planning. The more clearly the desired end state is described, the better convergence will be. 2. Generates improvement ideas via a rotating ensemble of specialized idea agents 3. **Scores and filters ideas** for goal alignment and quality 4. **Critiques ideas constructively** with suggested mitigations 5. **Evaluates strategic alignment** and long-term planning 6. Makes implementation decisions balancing creativity and criticism 7. Implements the plan with parallel coders 8. Reviews, fixes, and commits changes 9. **Runs QA** (build + test verification) 10. **Optimizes slow tests** to keep the suite fast 11. **Verifies goal completion** with 3-step feature inventory, per-feature checks, and auto-remediation 12. **Refactors oversized files** into smaller modules (every other cycle) 13. **Cleans up** temp files and build artifacts 14. Updates project documentation 15. **Records outcomes for learning and deduplication** 16. **Periodically synthesizes synergies** across recent work 17. **Checkpoints state** for pause/resume across restarts 18. Repeats the cycle infinitely until stopped Users can inject suggestions at any time via the Overseer agent, which takes priority over the autonomous idea generation pipeline. Note this system has been tested for some time but only in the dashboard with OpenCode/Claude Code configuration (OpenRouter mode is untested, but I welcome contributions if someone wants to use that mode and notices something is broken).
AI's Impact on Business: Speed vs. Smart Decision-Making
I’ve been thinking about this for a while, especially with all the discussions around AI replacing jobs. One thing that feels consistently misunderstood: AI doesn’t improve the quality of decisions by itself. It increases the speed at which existing decision logic operates. That has a simple consequence: Good systems become better. Weak systems fail faster. But there’s another layer that is often ignored. Right now, many companies are reacting to AI by reducing headcount. Some of that is rational: - there is real slack in certain roles - some work can already be automated or simplified In those cases, AI acts as a kind of cleanup mechanism. But this is where it gets more complex. If companies reduce people too quickly, they don’t just cut cost — they also remove: - domain knowledge - informal networks - context that is not documented anywhere This kind of knowledge is not easily replaced by AI. So you end up with a paradox: AI increases speed, but the organization loses the very knowledge needed to make good decisions at that speed. At the same time, layoffs are not always a signal of weak systems. Strong organizations can also reduce roles because they: - increase productivity per employee - reallocate work - shift toward new capabilities The difference is what happens next. Some organizations use AI to scale and create new opportunities. Others mainly use it to cut cost because they lack the structure to turn speed into growth. So instead of asking: “Will AI replace jobs?” A more relevant question might be: Is the organization structured in a way that can actually benefit from faster decision-making? Because if not, AI won’t make it smarter. It will just make it faster at being wrong.
AI Agents: Identity, Not Memory, Was the Key to Stability
Everyone's building memory layers right now. Longer context, better embeddings, persistent state across sessions. I spent weeks on the same thing. But the failure mode that actually cost me the most debugging time had nothing to do with memory. Here's what it looked like: an agent would be technically correct - good reasoning, clean output - but operating from the wrong context entirely. Answering questions nobody asked. Taking actions outside its scope. Not hallucinating. Drifting. Like a competent person who walked into the wrong meeting and started contributing without realizing they're in the wrong room. I run 11 persistent agents locally. Each one is a domain specialist - its entire life is one thing. The mail agent's every session, every test, every bug fix is about routing messages. The standards auditor's whole existence is quality checks. They're not generic workers configured for a task. They've each accumulated dozens of sessions of operational history in their domain, and that history is what makes them good at their job. When they started drifting, my first instinct was what everyone's instinct is: better memory. More context. None of it helped. An agent with perfect recall of its last 50 sessions would still lose track of who it was in session 51. What actually fixed it I separated identity from memory entirely. Three files per agent: passport.json - who you are. Role, purpose, principles. Rarely changes. This is the anchor. local.json - what happened. Rolling session history, key learnings. Capped and trimmed when it fills up. observations.json - what you've noticed about the humans and agents you work with. Concrete stuff like "the git agent needs 2 retries on large diffs" or "quality audits overcorrect on technical claims." The agent writes these itself based on what actually happens. Identity loads first, then memory, then observations. That ordering matters. When the identity file loads first, the agent has a stable reference point before any history lands. The mail routing agent learned the sharpest version of this. When identity was ambiguous, it would route messages from the wrong sender. The fix wasn't better routing logic - it was: fail loud when identity is unclear. Wrong identity is worse than silence. The files alone weren't enough Three JSON files helped, but didn't scale past a few agents. What actually made 11 work is that none of them need to understand the full system. Hooks inject context automatically every session - project rules, branch instructions, current plan. One command reaches any agent. Memory auto-archives when it fills up. Plans keep work focused so agents don't carry their entire history in context. The system learned from failing. The agents communicate through a local email system - they send each other tasks, status updates, bug reports. One agent monitors all logs for errors. When it spots something, it emails the agent who owns that domain and wakes them up to investigate. The agents fix each other. The memory agent iterated three sessions to fix a single rollover boundary condition - each time it shipped, observed a new edge case, and improved. These aren't cold modules. They break, they help each other fix it, they get better. That's how the system got to where it is. You don't need 11 agents The 11 agents in my setup maintain the framework itself. That's the reference implementation. But u could start with one agent on a side project - just identity and memory, pick up where u left off tomorrow. Need a team? Add a backend agent, a frontend agent, a design researcher. Three agents, same pattern, same commands. Or scale to 30 for a bigger system. Each new agent is one command and the same structure. What this doesn't solve This all runs locally on one machine. I don't know whether identity drift looks the same in hosted environments. If u run stateless agents behind an API, the problem might not exist for you. Small project, small community, growing. The pattern itself is small enough to steal - three JSON files and a convention. But the system that keeps agents coherent at scale is where the real work went. pip install aipass and two commands to get a working agent. The .trinity/ directory is the identity layer. Has anyone else tried separating identity from memory in their agent setups? Curious whether the ordering matters in other architectures, or if it's just an artifact of how this system evolved.
AI and Dune: The Debate on Thinking and AI Assistance
The Globe and Mail's editorial board ran a piece in March titled "AI can be a crutch, or a springboard." To illustrate the crutch half, they offered this: someone asked AI to explain a passage from Dune that warns against delegating thinking to machines. Instead of reading the book. That anecdote is doing more work than the studies the editorial cites. But the studies are real. Researchers at MIT published a paper in June 2025 titled "Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task" (Kosmyna et al., arXiv 2506.08872). The study tracked brain activity across three groups: people writing with ChatGPT, people using search engines, and people working unaided. The LLM group showed the weakest neural connectivity. Over four months, "LLM users consistently underperformed at neural, linguistic, and behavioral levels." The most striking finding: LLM users struggled to accurately quote their own work. They couldn't recall what they had just written. The Globe cites this and similar research to make a point about dependency. The implicit argument: hand enough of your thinking to a machine and you stop doing it yourself. That finding is probably accurate for the way most people use these tools. The question is whether that's the only way they can be used. The Globe's own title contains the counter-argument. Crutch or springboard. They wrote both words. They just didn't develop the second one. Ethan Mollick, a professor at Wharton who has been writing about AI use since the tools became widely available, argued in 2023 that the real challenge AI poses to education isn't that students will stop thinking, it's that the old structures assumed thinking was hard enough to enforce. ("The Homework Apocalypse," [oneusefulthing.org](http://oneusefulthing.org), July 2023.) When AI can do the surface-level cognitive work, the only tasks left worth assigning are the ones that require actual judgment. The tool, in that framing, doesn't reduce the demand for thinking. It raises the floor under it. Nate B. Jones, who writes and consults on what it actually takes to work well with AI, has made a sharper version of this argument. His position: using AI effectively requires more cognitive skill, not less. Specifically, it requires the ability to translate ambiguous intent into a precise, edge-case-aware specification that an AI can execute correctly. It requires detecting errors in output that is fluent and confident-sounding but wrong. It requires recognizing when an AI has drifted from your intent, or is confirming a premise it should be challenging. These are not passive skills. They are harder versions of the same thinking the MIT study found LLM users weren't doing. The difference between the group that lost neural connectivity and the group that doesn't isn't the tool. It's what they decided to do with it. Here's my own evidence. In the past year I built a working web application. Python backend. JavaScript frontend. Deployed on two hosting platforms. Payment processing. User authentication. A full data model. I do not know how to code. Every product decision was mine. Every architectural call. Every tradeoff judgment. I defined what the system needed to do, why, and what done looked like. I reviewed every significant change before it was accepted. When something broke, I identified where the breakdown was and directed the fix. The implementation was handled by AI. The thinking was mine. This mode (call it AI-directed building) is the opposite of the Dune reader. The quality of what gets produced is entirely a function of how clearly you can think, how precisely you can specify, and how critically you can evaluate what comes back. There is no shortcut in that. A vague brief to an AI doesn't produce a confused output. It produces a confident, fluent, wrong one. The discipline that prevents that is yours to supply. Non-coders building functional software with AI is common enough now that it isn't a story. What's less visible is the specificity of judgment underneath the ones that actually work. The practices that force more thinking rather than less are not complicated, but they require a decision to use the tool differently. When I've formed a position on something, I give the AI full context and ask it to make the strongest possible case against me. Ask for the hardest opposing argument it can construct. Then I read it. Sometimes it changes nothing. Sometimes it surfaces something I had dismissed without fully examining. The AI doesn't form my view. It stress-tests one I've already formed. When I'm uncertain between options, I don't ask which is better. I ask: here are two approaches, here is my constraint, now what does each cost me, and what does each require me to give up? I make the call. The AI laid out the shape of the decision. The judgment was mine. The uncomfortable part of thinking is still yours in this mode. The tool makes the work more rigorous, not easier. The MIT researchers and the Globe editorial are almost certainly right about the majority of current use. Passive use produces passive outcomes. That's not a controversial claim. The crutch half and the springboard half use the same interface. The difference is whether the person in front of it decided to think. What are you doing with it that forces more thinking rather than less? Are you using it to skip a step, or to take a harder one? Genuinely asking.