Archive

Discover and discuss technology tools

Explore the Tiscuss archive by category or keyword, then jump into conversations around what matters most.

Search and filters
Reset
Active: AI Infrastructure / query: Model / page 1 of 1 / 42 total
AI Infrastructure

Fast Local LLM Inference Benchmarks and Deployment Tips

Community benchmarks and infra recommendations for local models.

Global · Developers · Jun 23, 2026
AI Infrastructure

Rio 3.5 Open 397B AI Infrastructure Unveiled

Rio 3.5 Open 397B AI Infrastructure Unveiled Rio 3.5 Open 397B represents a groundbreaking advancement in large language models, offering robust AI capabilities…

Global · Developers · Jun 16, 2026
AI Infrastructure

AI Memory Systems May Degrade Model Performance

New research suggests that AI memory systems can degrade model performance and encourage sycophantic tendencies.

Global · Developers · Jun 11, 2026
AI Infrastructure

Anthropic Partners with TCS for Enterprise AI Deployment

The partnership will see TCS creating a business unit focused on deploying Anthropic's AI models to its customers.

Global · Enterprises · Jun 11, 2026
AI Infrastructure

Google's Gemma 4 12B Model: AI Infrastructure Advancements

Google's Gemma 4 12B Model: Revolutionizing AI Infrastructure Google's Gemma 4 12B Model marks a significant leap in artificial intelligence (AI) infrastructure…

Global · Developers · Jun 7, 2026
AI Infrastructure

Unsloth/Gemma-4-26B-A4B-IT-QAT-GGUF: New AI Infrastructure on Hugging

Unsloth/Gemma 4 26B A4B IT QAT GGUF: Revolutionizing AI Infrastructure on Hugging Face Introduction to Unsloth/Gemma 4 26B A4B IT QAT GGUF Artificial Intelligen…

Global · Developers · Jun 7, 2026
AI Infrastructure

NVIDIA Nemotron 3 Ultra: 550B Parameters for AI Infrastructure

NVIDIA Nemotron 3 Ultra: Revolutionizing AI Infrastructure with 550B Parameters The NVIDIA Nemotron 3 Ultra is a cutting edge AI model designed to push the boun…

Global · Developers · Jun 5, 2026
AI Infrastructure

NVIDIA Nemotron 3 Ultra: 550B Parameters, A55B, BF16

NVIDIA Nemotron 3 Ultra: Unleashing the Power of 550B Parameters with A55B, BF16 The NVIDIA Nemotron 3 Ultra represents a groundbreaking advancement in AI techn…

Global · Developers · Jun 5, 2026
AI Infrastructure

NVIDIA Cosmos: Open Platform for Physical AI Development

NVIDIA Cosmos is an open platform of world models, datasets, and tools that enables developers to build Physical AI for robots, autonomous vehicles, smart infrastructure, and more.

Global · Developers · Jun 5, 2026
AI Infrastructure

AI Weather Startup WindBorne Outperforms Government Forecasts

WindBorne benefits from its unique combination of model-building and data collection. The company now has about 400 balloons in flight gathering sensor readings at any given time, launched from 15 sites around the globe. The advances in its current model come from improvements in how the data collected by the balloons is fed into the models.

Global · General · Jun 2, 2026
AI Infrastructure

Anthropic AI Files to Go Public, Secures Top Enterprise Customers

Anthropic, now an AI powerhouse that has landed top-tier enterprise customers, was once considered an underdog in the emerging world of large language models.

Global · Enterprises · Jun 2, 2026
AI Infrastructure

Nvidia Cosmos3-Super AI Infrastructure Unveiled

Nvidia Cosmos3 Super: Revolutionizing AI Infrastructure Nvidia has introduced the Cosmos3 Super, a groundbreaking AI infrastructure designed to push the boundar…

Global · Developers · Jun 2, 2026
AI Infrastructure

NVIDIA Cosmos3-Nano: Revolutionizing AI Infrastructure

NVIDIA Cosmos3 Nano: Revolutionizing AI Infrastructure The NVIDIA Cosmos3 Nano is setting new standards in AI infrastructure, delivering unparalleled performanc…

Global · Developers · Jun 2, 2026
AI Infrastructure

Groq Aims to Raise $650M for AI Inference Focus After Nvidia Deal

Chipmaker Groq is looking to raise $650 million in internal funding as it pivots from hardware to focus more on AI inference, the process of refining the way AI models respond to prompted requests, per Axios.

Global · Developers · May 30, 2026
AI Infrastructure

Reproducible World Model Research Platform Launched on GitHub

A platform for reproducible world model research and evaluation

Global · Developers · May 30, 2026
AI Infrastructure

Tencent's New AI Model: Hy-MT2-1.8B-GGUF on Hugging Face

Tencent has unveiled its latest AI innovation with the introduction of the Hy MT2 1.8B GGUF model, now available on the Hugging Face platform. This cutting edge…

Global · Developers · May 27, 2026
AI Infrastructure

Trump Delays AI Security Executive Order Due to Language Concerns

President Trump delayed signing an executive order that would have required pre-release government security reviews of AI models, citing dissatisfaction with the order's language.

US · General · May 22, 2026
AI Infrastructure

KVBoost Speeds Up HuggingFace Models with Efficient Cache Reuse

KVBoost: Enhancing HuggingFace Models with Effective Cache Management KVBoost emerges as a pioneering solution tailored to bolster the performance of HuggingFac…

Global · Developers · May 22, 2026
AI Infrastructure

Tencent's Hy-MT2-1.8B: Revolutionizing AI Infrastructure

Tencent's Hy MT2 1.8B: Transforming AI Infrastructure Tencent's innovative Hy MT2 1.8B is setting new benchmarks in the realm of AI infrastructure. This cutting…

Global · Developers · May 22, 2026
AI Infrastructure

Bytedance's AI Infrastructure: A Deep Dive into github.com/bytedance

Bytedance’s AI Infrastructure: Exploring github.com/bytedance Bytedance, the technological titan behind iconic platforms like TikTok and Douyin, has opened the …

Global · Developers · May 21, 2026
AI Infrastructure

Andrej Karpathy Joins Anthropic's Pre-training Team

Pre-training is responsible for the large-scale training runs that give Claude its core knowledge and capabilities, according to the company. It's also one of the most expensive, compute-intensive phases of building a frontier model.

Global · Developers · May 19, 2026
AI Infrastructure

Google's Genie Simulates Real Streets with Street View Integration

Google DeepMind is integrating Street View with Project Genie to create immersive, interactive world simulations for robotics, gaming, and travel, allowing users to explore environments, weather changes, and rare scenarios.

Global · General · May 19, 2026
AI Infrastructure

Origin Lab Raises $8M for AI Data Marketplace

Origin Lab will serve as a marketplace where AI labs can buy high-quality licensed data, and video-game companies can sell it.

Global · Founders · May 14, 2026
AI Infrastructure

Medicare's New AI Payment Model: ACCESS Explained

There is no governmental mechanism to pay for an AI agent that monitors a patient between visits, calls to check in, coordinates a housing referral, or makes sure someone picks up their medication. ACCESS creates that mechanism for the first time.

US/CA/AU · Founders · May 13, 2026
AI Infrastructure

Samsara Uses AI to Detect and Fix Potholes Efficiently

Fleet management company Samsara has developed an AI model to detect different kinds of potholes and gauge how fast they're deteriorating.

Global · Enterprises · May 12, 2026
AI Infrastructure

Anthropic: AI Portrayals Influence AI Behavior

Fictional portrayals of artificial intelligence can have a real effect on AI models, according to Anthropic.

Global · General · May 11, 2026
AI Infrastructure

Apple's Sharp AI Model Runs in Browser with ONNX Runtime Web

Apple's Innovative AI Model: Running in the Browser with ONNX Runtime Web Apple's recent integration of AI capabilities has taken a leap forward with the introd…

Global · Developers · May 3, 2026
AI Infrastructure

Pentagon Partners with Nvidia, Microsoft, and AWS for AI on Classified

The deals come as the DOD has doubled down on diversifying its exposure to AI vendors in the wake of its controversial dispute with Anthropic over usage terms of its AI models.

US · Enterprises · May 2, 2026
AI Infrastructure

Meta Acquires Assured Robot Intelligence for AI Advancements

Meta bought humanoid startup Assured Robot Intelligence to beef up its AI models for robots, the company said.

Global · General · May 2, 2026
AI Infrastructure

IBM Granite: Multilingual Embeddings for AI Infrastructure

Harnessing the Power of Multilingual Embeddings: IBM Granite for AI Infrastructure Multilingual embeddings are at the forefront of advancing AI infrastructure, …

Global · Developers · May 2, 2026
AI Infrastructure

Elon Musk Testifies on xAI's Grok Training with OpenAI Models

"Distillation" is a hot topic as frontier labs try to prevent smaller competitors from copying their models.

Global · General · Apr 30, 2026
AI Infrastructure

IBM Granite 4.1-30B: Revolutionizing AI Infrastructure on Hugging Face

IBM Granite 4.1 30B: Revolutionizing AI Infrastructure on Hugging Face IBM has recently unveiled the groundbreaking IBM Granite 4.1 30B model, aimed at cementin…

Global · Developers · Apr 30, 2026
AI Infrastructure

Elon Musk's xAI Uses OpenAI Tech for Training

Elon Musk's xAI: Leveraging OpenAI for Advanced Training Elon Musk's new venture, xAI, is making waves in the artificial intelligence (AI) community by utilizin…

Global · General · Apr 30, 2026
AI Infrastructure

Amazon Launches OpenAI Models on AWS After Microsoft Deal

A day after OpenAI got Microsoft to agree to end exclusive rights, AWS announced a slate of OpenAI model offerings, including a new agent service.

Global · Developers · Apr 29, 2026
AI Infrastructure

TiGrIS: Tiling Compiler for Embedded ML Models

TiGrIS: A Cutting Edge Compiler for Embedded Machine Learning TiGrIS, which stands for Tiling Compiler for Embedded Machine Learning Models, is an innovative to…

Global · Developers · Apr 29, 2026
AI Infrastructure

Nvidia Exec: AI Currently More Expensive Than Human Workers

Nvidia’s vice president of applied deep learning, Bryan Catanzaro, recently stated that for his team, “the cost of compute is far beyond the costs of the employees,” highlighting that AI is currently more expensive than human workers. This challenges the narrative that widespread tech layoffs (including Meta’s planned cut of \~8,000 jobs and Microsoft’s voluntary buyouts) signal an imminent replacement of humans by AI. An MIT study from 2024 supports this, finding that AI automation is economically viable in only 23% of roles where vision is central, and cheaper for humans in the remaining 77%. Despite heavy AI investment—Big Tech has announced $740 billion in capital expenditures so far this year, a 69% increase from 2025—there is still no clear evidence of broad productivity gains or job displacement from AI. AI spending is driving up costs, with some executives like Uber’s CTO saying their budgets have already been “blown away.” Experts describe the situation as a short-term mismatch: high hardware, energy, and inference costs make AI less efficient than humans right now, though future improvements in infrastructure, model efficiency, and pricing models could tip the balance toward greater economic viability in the coming years.

Global · General · Apr 29, 2026
AI Infrastructure

Arc Gate: AI Tool Achieves Perfect Safety Benchmarks

Benchmarked on 40 out-of-distribution prompts, indirect requests, roleplay framings, hypothetical scenarios, technical phrasings. The stuff that slips past everything else. Arc Gate: P=1.00, R=1.00, F1=1.00 OpenAI Moderation API: P=1.00, R=0.75, F1=0.86 LlamaGuard 3 8B: P=1.00, R=0.55, F1=0.71 Zero false positives. Zero misses. Blocked prompts average 329ms and never reach your model. Detection overhead is \~350ms on top of your normal upstream latency. Sits in front of any OpenAI-compatible endpoint. No GPU on your side. One env var to configure. GitHub: https://github.com/9hannahnine-jpg/arc-gate Live dashboard: https://web-production-6e47f.up.railway.app/dashboard Happy to answer questions.

Global · Developers · Apr 28, 2026
AI Infrastructure

Google and Pentagon Partner for 'Any Lawful' AI Use

https://preview.redd.it/hbbp7hn1cxxg1.png?width=811&format=png&auto=webp&s=a633fe43837bf60e014afaa4c6cf3fe72a4976d3 I feel like this was inevitable - governments would want to use AI models eventually. Wondering what are the inhumane or harmful ways the employees were protesting about - Does this mean that Pentagon can basically spy on people? [Source](https://news.geobrowser.io/story/cd07a612f9e747efa89e35bef748122d) (full article)

Global · General · Apr 28, 2026
AI Infrastructure

Auroch Engine: Revolutionizing AI Memory for Personalization

Auroch Engine is an external memory layer for AI assistants — designed to give models better long-term recall, personalization, and context awareness across conversations. Instead of relying on scattered chat history or fragile built-in memory, Auroch Engine lets users store, retrieve, and organize important context through a dedicated memory API. The goal is simple: make AI feel less like a reset button every session, and more like a tool that actually learns your projects, preferences, workflows, and goals over time. Right now, it’s in early beta. We’re looking for first users who are interested in testing a lightweight developer-facing memory system for AI apps, agents, and personal productivity workflows. Ideal early users are people building with AI, experimenting with agents, or frustrated that their assistant keeps forgetting the important stuff. DM for more information or better visit our site: https://ai-recall-engine-q5viks70j-cartertbirchalls-projects.vercel.app

Global · Developers · Apr 28, 2026
AI Infrastructure

Navigating AI Agent Governance: A Growing Organizational Challenge

Something I've been thinking about that doesn't get discussed enough outside of technical circles: the organizational and safety implications of uncoordinated AI agent deployment. Companies are shipping agents fast. Customer service agents, coding agents, data analysis agents, internal ops agents. Each team builds their own. Each agent gets its own rules, its own permissions, its own behavior. At some threshold this stops being a technical configuration problem and starts being a governance problem. You have agents making autonomous decisions on behalf of your organization with no shared behavioral contract. No unified view of what your AI systems are authorized to do. Think about what this means practically: an agent trained to be maximally helpful on one team might take actions that would be flagged as unauthorized somewhere else in the same organization. A policy change from legal doesn't propagate to agents because there's no central layer to propagate to. Nobody knows which agents have access to what data. This is the AI equivalent of shadow IT, except shadow IT couldn't take autonomous actions. What's the right mental model for governing a fleet of AI agents? Treat each agent like an employee with a defined role and access policy? Build an org chart for agents? Create a behavioral constitution that all agents inherit? Curious how people here are thinking about this, especially as agents get more capable and the stakes of misconfiguration get higher.

Global · Founders · Apr 27, 2026
AI Infrastructure

Caliber: Open-Source Proxy for Enforcing LLM Agent Rules

Cross-posting here because this problem affects everyone building with AI agents. Prompt-based guardrails fail. The model follows your system prompt in a demo, then ignores rules when context gets big or the agent chains multiple steps. We built Caliber - an open-source proxy that reads your rules from plain markdown and enforces them at the API layer, not in the prompt. Every call. Provider-agnostic. Just hit 700 GitHub stars ⭐ and nearly 100 forks - the reception from devs building with AI has been amazing. Repo: [https://github.com/caliber-ai-org/ai-setup](https://github.com/caliber-ai-org/ai-setup) Would love: \- Feedback on the approach \- Feature requests from people building AI agents \- Anyone who wants to contribute to the project Building this open-source for the community.

Global · Developers · Apr 27, 2026
AI Infrastructure

AI Forensics: The Missing Link in AI Decision-Making

I work in AI security and compliance. This just bothers me a little bit, putting AI systems in front of decisions that change people’s lives via insurance claims, hiring, credit, defense applications and when someone asks wait, why did the system do that? we basically have nothing that would hold up in a courtroom. The explainability tools we have right now? SHAP, LIME, attention maps but they’re research tools. They’re not evidence. Researchers have shown you can build a model that actively discriminates while producing perfectly clean looking explanations. They have unbounded error, they give you different answers on different runs, and there’s no way for the other side’s lawyer to independently check the work. That’s a problem if you’re trying to meet Daubert standards. And the regulatory side is moving just as fast. EU AI Act has record keeping requirements coming online. The FY26 NDAA has an AI cybersecurity framework provision with implementation due mid 2026. States are doing their own thing. Courts are starting to actually push back on AI evidence under FRE 702. There is a ton of AI observability tooling out there. Great for ops. There’s governance platforms. Great for policy. But when it comes to something that’s actually forensic grade where opposing counsel is actively trying to tear it apart, where a third party can independently verify what happened without just trusting the vendor,I’m not seeing it. What am I missing?

Global · Developers · Apr 27, 2026
PreviousPage 1 / 1Next