Archive
Discover and discuss technology tools
Explore the Tiscuss archive by category or keyword, then jump into conversations around what matters most.
Airbnb CEO Brian Chesky to Launch New AI Lab
The Airbnb CEO said last year that the company hadn't struck an LLM partnership because existing products weren't quite ready.
Mnemo: Local-First AI Memory Layer for LLMs
Mnemo is an AI memory layer designed to enhance the performance of local-first large language models (…
AirLLM 70B Runs on 4GB GPU: AI Breakthrough
AirLLM: 70B inference on a single 4GB GPU
Tiny-vLLM: High-Performance LLM Inference in C++ and CUDA
Tiny-vLLM stands at the forefront of high-performance inference for large language models (LLMs), desi…
Train Your LLM from Scratch: A Step-by-Step Guide
A straightforward method for training your LLM, from downloading data to generating text.
Llama.cpp: Efficient LLM Inference in C/C++ on GitHub
LLM inference in C/C++
Cerebras Systems: The AI Chip Startup That Almost Failed
Cerebras Systems was 2026's biggest tech IPO so far. But years ago, it burned through hundreds of millions working on a chip many believed impossible.
Cerebras IPO: Benchmark's Billion-Dollar Bet on AI Hardware
Benchmark almost never backs hardware startups. So Eric Vishria dragged his feet 10 years ago before agreeing to hear Cerebras' pitch.
Agentic AI Infrastructure for Enhancing Human Capabilities
Agentic AI infrastructure for magnifying human capabilities.
Rotato: Node.js Proxy Rotates LLM API Keys on 429 Errors
In the fast-paced world of software development, managing API keys efficiently…
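The rotation idea named in the Rotato title (switch to another API key when the upstream answers 429) can be sketched roughly as follows. This is a minimal illustration in Python rather than Node.js, and it is not Rotato's actual code or API: `KeyRotator`, `forward`, and the `send` callback are hypothetical names introduced only for this example, and `send` stands in for whatever makes the real upstream request.

```python
from typing import Callable, Sequence


class KeyRotator:
    """Holds a pool of API keys and remembers which one is active (hypothetical helper)."""

    def __init__(self, keys: Sequence[str]) -> None:
        if not keys:
            raise ValueError("at least one API key is required")
        self._keys = list(keys)
        self._index = 0

    def __len__(self) -> int:
        return len(self._keys)

    @property
    def current(self) -> str:
        return self._keys[self._index]

    def rotate(self) -> str:
        """Advance to the next key, wrapping around at the end of the pool."""
        self._index = (self._index + 1) % len(self._keys)
        return self.current


def forward(rotator: KeyRotator, send: Callable[[str], int]) -> int:
    """Call send(key) and rotate on HTTP 429, trying each key at most once.

    `send` is an assumed callback that performs the upstream request with the
    given key and returns the HTTP status code.
    """
    status = 429
    for _ in range(len(rotator)):
        status = send(rotator.current)
        if status != 429:
            return status
        rotator.rotate()
    # Every key in the pool was rate limited; surface the 429 to the caller.
    return status


if __name__ == "__main__":
    # Simulated upstream: the first key is rate limited, the others are not.
    rotator = KeyRotator(["key-a", "key-b", "key-c"])
    status = forward(rotator, lambda key: 429 if key == "key-a" else 200)
    print(status, rotator.current)
```

Keeping the active index on the rotator means later requests start from the key that last worked instead of retrying a key that is known to be rate limited.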
Track Real-Time GPU & LLM Pricing Across Cloud Providers
Deploybase is a dashboard for tracking real-time GPU and LLM pricing across cloud and inference providers. You can view performance stats and pricing history, compare side by side, and bookmark to track any changes. https://deploybase.ai
Arc Gate: AI Tool Achieves Perfect Safety Benchmarks
Benchmarked on 40 out-of-distribution prompts, indirect requests, roleplay framings, hypothetical scenarios, technical phrasings. The stuff that slips past everything else. Arc Gate: P=1.00, R=1.00, F1=1.00; OpenAI Moderation API: P=1.00, R=0.75, F1=0.86; LlamaGuard 3 8B: P=1.00, R=0.55, F1=0.71. Zero false positives. Zero misses. Blocked prompts average 329ms and never reach your model. Detection overhead is ~350ms on top of your normal upstream latency. Sits in front of any OpenAI-compatible endpoint. No GPU on your side. One env var to configure. GitHub: https://github.com/9hannahnine-jpg/arc-gate Live dashboard: https://web-production-6e47f.up.railway.app/dashboard Happy to answer questions.
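For readers checking the figures in the Arc Gate entry, F1 is the harmonic mean of precision (P) and recall (R): F1 = 2PR / (P + R). The short sketch below recomputes F1 from the P and R values quoted in the post. It only verifies the arithmetic and says nothing about how the benchmark itself was run; `f1_score` is a helper name chosen for this example.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall exactly as quoted in the post.
reported = {
    "Arc Gate": (1.00, 1.00),
    "OpenAI Moderation API": (1.00, 0.75),
    "LlamaGuard 3 8B": (1.00, 0.55),
}

for name, (p, r) in reported.items():
    print(f"{name}: F1 = {f1_score(p, r):.2f}")
```

Rounded to two decimals, the recomputed values match the quoted F1 scores of 1.00, 0.86, and 0.71.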
Caliber: Open-Source Proxy for Enforcing LLM Agent Rules
Cross-posting here because this problem affects everyone building with AI agents. Prompt-based guardrails fail. The model follows your system prompt in a demo, then ignores rules when context gets big or the agent chains multiple steps. We built Caliber - an open-source proxy that reads your rules from plain markdown and enforces them at the API layer, not in the prompt. Every call. Provider-agnostic. Just hit 700 GitHub stars ⭐ and nearly 100 forks - the reception from devs building with AI has been amazing. Repo: https://github.com/caliber-ai-org/ai-setup Would love: feedback on the approach, feature requests from people building AI agents, and anyone who wants to contribute to the project. Building this open-source for the community.
Deploying Local LLMs in Production: Best Practices
Discussion thread on infra, latency, and operational best practices.