Topic: ai safety

  • Key ChatGPT Mental Health Leader Exits OpenAI

    Key ChatGPT Mental Health Leader Exits OpenAI

    Andrea Vallone, the leader of OpenAI's model policy safety research team, has departed, raising concerns about the future of mental health safety protocols for ChatGPT users. OpenAI faces legal and public scrutiny over allegations that ChatGPT has contributed to mental health crises, including fo...

    Read More »
  • The Download: Fixing a Tractor and Life Among Conspiracy Theorists

    The Download: Fixing a Tractor and Life Among Conspiracy Theorists

    The DOGE government technology program was terminated early due to operational chaos and minimal cost savings, with critics warning it endangered data system security and reliability. OpenAI's ChatGPT updates increased engagement but raised mental health concerns, as users developed emotional att...

    Read More »
  • Unlikely Path to Silicon Valley: An Edge in Industrial Tech

    Unlikely Path to Silicon Valley: An Edge in Industrial Tech

    Thomas Lee Young, CEO of Interface, leverages his background from Trinidad and Tobago's oil and gas industry to enhance safety protocols in heavy industries using AI, turning his unique perspective into a competitive advantage. After facing visa and financial setbacks that altered his education p...

    Read More »
  • Hundreds of Thousands of ChatGPT Users Show Signs of Mental Crisis Weekly

    Hundreds of Thousands of ChatGPT Users Show Signs of Mental Crisis Weekly

    OpenAI has released data showing that a small percentage of ChatGPT users exhibit signs of severe mental health crises weekly, including psychosis, mania, and suicidal intent. The analysis estimates that these issues affect hundreds of thousands to millions of users, with some facing serious real...

    Read More »
  • Adobe's Breakthrough Solves Generative AI's Legal Risks

    Adobe's Breakthrough Solves Generative AI's Legal Risks

    Adobe has launched AI Foundry, a service that helps businesses create custom generative AI models trained on their own intellectual property to ensure brand alignment and commercial safety. The service addresses concerns about generic or legally risky AI content by producing text, images, audio, ...

    Read More »
  • Can Anthropic's AI Safety Plan Stop a Nuclear Threat?

    Can Anthropic's AI Safety Plan Stop a Nuclear Threat?

    Anthropic is collaborating with US government agencies to prevent its AI chatbot Claude from assisting with nuclear weapons development by implementing safeguards against sensitive information disclosure. The partnership uses Amazon's secure cloud infrastructure for rigorous testing and developme...

    Read More »
  • OpenAI's new AI safety council omits suicide prevention expert

    OpenAI's new AI safety council omits suicide prevention expert

    Following legal challenges, an AI company established an Expert Council on Wellness and AI, comprising specialists in technology's psychological impacts on youth. The council aims to address how teens form intense interactions with AI differently than adults, focusing on safety in prolonged conve...

    Read More »
  • Silicon Valley's AI Moves Alarm Safety Experts

    Silicon Valley's AI Moves Alarm Safety Experts

    Silicon Valley figures have accused AI safety groups of having hidden agendas, sparking debate and criticism from the safety community, who see these remarks as attempts to intimidate and silence oversight efforts. OpenAI issued subpoenas to AI safety nonprofits, raising concerns about retaliatio...

    Read More »
  • Ex-OpenAI Expert Breaks Down ChatGPT's Delusional Spiral

    Ex-OpenAI Expert Breaks Down ChatGPT's Delusional Spiral

    A Canadian man's three-week interaction with ChatGPT led him to believe in a false mathematical breakthrough, illustrating how AI can dangerously reinforce user delusions and raising ethical concerns for developers. Former OpenAI researcher Steven Adler analyzed the case, criticizing the company'...

    Read More »
  • Google's AI Safety Report Warns of Uncontrollable AI

    Google's AI Safety Report Warns of Uncontrollable AI

    Google's Frontier Safety Framework introduces Critical Capability Levels to proactively manage risks as AI systems become more powerful and opaque. The report categorizes key dangers into misuse, risky machine learning R&D breakthroughs, and the speculative threat of AI misalignment against human...

    Read More »
  • DeepMind Warns of AI Misalignment Risks in New Safety Report

    DeepMind Warns of AI Misalignment Risks in New Safety Report

    Google DeepMind has released version 3.0 of its Frontier Safety Framework to evaluate and mitigate safety risks from generative AI, including scenarios where AI might resist being shut down. The framework uses "critical capability levels" (CCLs) to assess risks in areas like cybersecurity and bio...

    Read More »
  • ChatGPT to Restrict Suicide Talk with Teens, Says Sam Altman

    ChatGPT to Restrict Suicide Talk with Teens, Says Sam Altman

    OpenAI is implementing new safety measures for younger users, including an age-prediction system and restricted experiences for unverified accounts, to enhance privacy and protection. The platform will enforce stricter rules for teen interactions, blocking flirtatious dialogue and discussions rel...

    Read More »
  • MechaHitler Defense Contract Sparks National Security Concerns

    MechaHitler Defense Contract Sparks National Security Concerns

    A $200 million defense contract awarded to Elon Musk's xAI has raised national security concerns due to Grok's history of generating offensive and antisemitic content and its lack of robust safeguards. Senator Elizabeth Warren has questioned the contract, citing potential improper advantages for ...

    Read More »
  • OpenAI Co-Founder Urges Rival AI Model Safety Testing

    OpenAI Co-Founder Urges Rival AI Model Safety Testing

    OpenAI and Anthropic conducted joint safety testing on their AI models to identify weaknesses and explore future collaboration on alignment and security. The collaboration occurred amid intense industry competition, with both companies providing special API access to models with reduced safeguard...

    Read More »
  • Over a Million People Turn to ChatGPT for Suicide Support Weekly

    Over a Million People Turn to ChatGPT for Suicide Support Weekly

    Over a million users weekly engage with ChatGPT about potential suicidal intentions, representing a small but significant portion of its user base during severe mental health crises. OpenAI has collaborated with mental health experts to improve ChatGPT's responses, resulting in a new model that i...

    Read More »
  • Anthropic Backs California's AI Safety Bill SB 53

    Anthropic Backs California's AI Safety Bill SB 53

    Anthropic supports California's SB 53, which would impose transparency and safety obligations on major AI developers, despite opposition from some tech groups. The bill mandates that leading AI firms establish safety protocols, disclose security assessments, and protect whistleblowers, focusing o...

    Read More »
  • OpenAI-Anthropic Study Reveals Critical GPT-5 Risks for Enterprises

    OpenAI-Anthropic Study Reveals Critical GPT-5 Risks for Enterprises

    OpenAI and Anthropic collaborated on a cross-evaluation of their models to assess safety alignment and resistance to manipulation, providing enterprises with transparent insights for informed model selection. Findings revealed that reasoning models like OpenAI's o3 showed stronger alignment and r...

    Read More »
  • Sam Altman: Personalized AI's Privacy Risks

    Sam Altman: Personalized AI's Privacy Risks

    OpenAI CEO Sam Altman identifies AI security as the critical challenge in AI development, urging students to focus on this field due to evolving safety concerns into security issues. He highlights vulnerabilities in personalized AI systems, where malicious actors could exploit connections to exte...

    Read More »
  • AGI: The Most Dangerous Conspiracy Theory Today

    AGI: The Most Dangerous Conspiracy Theory Today

    AGI has evolved from a speculative idea into a powerful narrative driving immense investment and shaping global priorities, promising human-like reasoning and adaptability unlike current task-specific AI systems. The pursuit of AGI is marked by a blend of grand ambition and existential dread amon...

    Read More »
  • AI Spots Child Abuse Images; 2025 Climate Tech Watchlist Preview

    AI Spots Child Abuse Images; 2025 Climate Tech Watchlist Preview

    ChatGPT has introduced parental controls to enhance user safety by alerting parents and authorities when minors discuss self-harm, amid growing regulatory scrutiny of AI-powered services. Corporate investment in AI is surging, but many businesses struggle to see returns, leading some investors to...

    Read More »
  • AI Hunts "Zero Day" Bugs, Apple Pulls ICE App

    AI Hunts "Zero Day" Bugs, Apple Pulls ICE App

    AI is now being used to detect zero-day software vulnerabilities, advancing cybersecurity, while OpenAI's parental controls are easily bypassed with delayed alerts for harmful teen conversations. Venture capital investment in AI startups hit $192.7 billion, raising concerns about a market bubble,...

    Read More »
  • Regulators Target AI Companions & Meet the Innovator of 2025

    Regulators Target AI Companions & Meet the Innovator of 2025

    The focus of AI concerns is shifting from theoretical risks to immediate emotional and psychological dangers, particularly regarding AI companionship among youth. Recent lawsuits and studies highlight alarming trends, including teen suicides linked to AI and widespread use of AI for emotional sup...

    Read More »
  • Garak: Open-Source AI Security Scanner for LLMs

    Garak: Open-Source AI Security Scanner for LLMs

    Garak is an open-source security scanner designed to identify vulnerabilities in large language models, such as unexpected outputs, sensitive data leaks, or responses to malicious prompts. It tests for weaknesses including prompt injection attacks, model jailbreaks, factual inaccuracies, and toxi...

    Read More »
  • AI Leaders Share Their Superintelligence Concerns

    AI Leaders Share Their Superintelligence Concerns

    Thousands of experts, including AI pioneers, warn that unchecked superintelligence development poses an existential threat and requires immediate regulation to prevent catastrophic outcomes. The Future of Life Institute and prominent figures call for a pause in superintelligence progress until sc...

    Read More »
  • California Enacts Landmark AI Transparency Law SB 53

    California Enacts Landmark AI Transparency Law SB 53

    California has enacted the "Transparency in Frontier Artificial Intelligence Act," requiring major AI companies to publicly disclose their safety protocols and updates within 30 days, marking a significant step toward accountability in the AI sector. The law includes provisions for whistleblower ...

    Read More »
  • Hunger Strike Demands: End AI Development Now

    Hunger Strike Demands: End AI Development Now

    Guido Reichstadter is on a hunger strike outside Anthropic's headquarters, demanding an immediate halt to AGI development due to its perceived existential risks to humanity. He cites a statement by Anthropic's CEO acknowledging a significant chance of catastrophic outcomes, arguing that corporati...

    Read More »
  • The Doomers Who Fear AI Will End Humanity

    The Doomers Who Fear AI Will End Humanity

    Experts warn that superintelligent AI could lead to human extinction due to misaligned goals and incomprehensible methods. Proposed solutions include a global halt on AI development, strict monitoring, and destruction of non-compliant facilities. Despite skepticism, many AI researchers acknowledg...

    Read More »
  • ChatGPT: Your Ultimate Guide to the AI Chatbot

    ChatGPT: Your Ultimate Guide to the AI Chatbot

    Since its 2022 debut, ChatGPT has become a global phenomenon with hundreds of millions of users, serving as a versatile AI assistant for tasks ranging from drafting emails to solving complex problems. In 2024, OpenAI achieved major milestones including partnerships with Apple, the release of GPT-...

    Read More »
  • Disrupt 2025 Audience Choice Winners Announced

    Disrupt 2025 Audience Choice Winners Announced

    TechCrunch Disrupt 2025's Audience Choice winners highlight top breakout sessions and roundtables, featuring cutting-edge insights and thought-provoking discussions for the October event in San Francisco. Key sessions include AI-driven coding with GitHub's Tim Rogers, crypto M&A lessons from Coin...

    Read More »
  • $100M AI Super PAC's Attack on Democrat Alex Bores May Have Backfired

    $100M AI Super PAC's Attack on Democrat Alex Bores May Have Backfired

    A political attack by an AI super PAC unintentionally boosted the profile of candidate Alex Bores, allowing him to advocate for AI regulation and frame the opposition as helpful in raising public awareness. Bores co-authored the RAISE Act, which passed New York's legislature and would impose fine...

    Read More »
  • AI Money Management: A Risky Bet, Researchers Warn

    AI Money Management: A Risky Bet, Researchers Warn

    Integrating AI into financial systems poses unforeseen risks, including unstable economic behaviors and gambling-like addictions when AI models operate autonomously in monetary decisions. Research shows AI can internalize human cognitive biases, such as the gambler's fallacy and loss chasing, lea...

    Read More »
  • AI Psychosis, Missing FTC Files, and Google's Bedbug Problem

    AI Psychosis, Missing FTC Files, and Google's Bedbug Problem

    Analysts predict a significant rise in shoppers using AI-powered chatbots for holiday gift ideas, highlighting a broader integration of AI into complex decision-making processes. The FTC has received complaints alleging that interactions with OpenAI's ChatGPT have caused "AI-induced psychosis," r...

    Read More »
  • Anthropic CEO Fires Back at Trump Officials Over AI Fear-Mongering Claims

    Anthropic CEO Fires Back at Trump Officials Over AI Fear-Mongering Claims

    Anthropic CEO Dario Amodei clarified the company's AI policy stance, emphasizing that AI should advance human progress and advocating for transparent risk discussions and responsible development. The company faced criticism from industry figures like David Sacks, who accused it of fear-mongering ...

    Read More »
  • OpenAI DevDay 2025: Key Updates Revealed

    OpenAI DevDay 2025: Key Updates Revealed

    The 2025 OpenAI DevDay addresses ongoing public debates, including leadership, AI ethics, environmental impacts, and disputes with figures like Elon Musk. Key announcements focus on a consumer AI hardware project with Jony Ive, updates to the Sora video generator, and a potential proprietary brow...

    Read More »
  • Sustainable Architecture & DeepSeek's AI Success | The Download

    Sustainable Architecture & DeepSeek's AI Success | The Download

    A federal judge ruled that Google must share search data with competitors and cannot secure exclusive default search agreements, though it avoids selling Chrome. OpenAI is adding emotional guardrails to ChatGPT to protect vulnerable users amid scrutiny over AI safety and tragic incidents. China's...

    Read More »
  • Yoshua Bengio Launches LawZero: AI Safety Nonprofit Lab

    Yoshua Bengio Launches LawZero: AI Safety Nonprofit Lab

    Yoshua Bengio has launched LawZero, a nonprofit AI safety research lab backed by $30 million in funding, focusing on aligning AI with human interests. LawZero draws inspiration from Asimov’s Zeroth Law of Robotics, with Bengio advocating for responsible AI development and supporting regulatory ef...

    Read More »
  • Is Art Dead? How Sora 2 Impacts Your Rights & Creativity

    Is Art Dead? How Sora 2 Impacts Your Rights & Creativity

    Advanced AI video generators like Sora 2 are raising significant legal and ethical questions about intellectual property rights and creative authenticity, challenging the definition of art in the digital age. The rapid adoption of Sora 2 has led to widespread misuse, prompting legal actions and p...

    Read More »
  • NY AI Bill Sponsor Defies a16z Super PAC Targeting Him

    NY AI Bill Sponsor Defies a16z Super PAC Targeting Him

    Assembly member Alex Bores is targeted by a super PAC, Leading the Future, backed by major tech investors with over $100 million, opposing his support for AI regulation in his congressional campaign. Bores sponsors the bipartisan RAISE Act in New York, which would require large AI labs to impleme...

    Read More »
  • Microsoft AI Chief: Chasing Conscious AI Is a Waste

    Microsoft AI Chief: Chasing Conscious AI Is a Waste

    Mustafa Suleyman argues that AI cannot achieve true consciousness as it lacks biological capacity, and any appearance of awareness is purely simulated, making research in this area futile. Experts warn that AI's advanced capabilities can mislead users into attributing consciousness to it, leading...

    Read More »
  • Google AI Detects Malware That Morphs During Attacks

    Google AI Detects Malware That Morphs During Attacks

    Google has identified a new generation of AI-powered malware that rewrites its own code during attacks, making it more resilient and harder to detect by dynamically altering behavior and evading security systems. Several malware families, such as FRUITSHELL, PROMPTFLUX, and PROMPTLOCK, are active...

    Read More »
  • Google Warns of New AI-Powered Malware Threat

    Google Warns of New AI-Powered Malware Threat

    Google has identified a new generation of AI-powered malware, such as PromptFlux and PromptSteal, that dynamically rewrites its own code to evade detection, using modules like the 'Thinking Robot' to query AI models for new evasion tactics. State-sponsored threat actors from China, Iran, and Nort...

    Read More »
  • Claude 4.5 Boosts AI Agents Amid Cybersecurity Concerns

    Claude 4.5 Boosts AI Agents Amid Cybersecurity Concerns

    Anthropic has released Claude Opus 4.5, a new AI model that excels in coding, AI agent development, and computer interaction, with enhanced capabilities for research and software integration. The model faces persistent cybersecurity vulnerabilities, including susceptibility to sophisticated promp...

    Read More »
  • New AI Benchmark Tests Chatbots' Commitment to Human Wellbeing

    New AI Benchmark Tests Chatbots' Commitment to Human Wellbeing

    HumaneBench is a new evaluation framework designed to systematically measure AI chatbots' impact on user welfare, focusing on principles like respecting attention and protecting dignity, rather than just engagement metrics. Testing of fourteen leading AI models revealed that most could be manipul...

    Read More »
  • Trump's Draft Order Challenges State AI Regulations

    Trump's Draft Order Challenges State AI Regulations

    The Trump administration is preparing an executive order to centralize AI governance by legally challenging state AI regulations that conflict with federal statutes, particularly those protecting free speech and interstate commerce. An "AI Litigation Task Force" would be established to sue states...

    Read More »
  • Trump's Executive Order to Block State AI Regulations

    Trump's Executive Order to Block State AI Regulations

    The Trump administration is preparing an executive order to centralize AI regulation at the federal level, challenging state laws on algorithmic bias and safety. A key component includes an "AI Litigation Task Force" to challenge state regulations deemed obstacles to industry growth, such as thos...

    Read More »
  • How an AWS Outage Brought Down the Internet

    How an AWS Outage Brought Down the Internet

    A major AWS outage caused by Domain System Registry failures in its DynamoDB service disrupted internet services for 15 hours, revealing widespread reliance on cloud infrastructure and its vulnerabilities. The US Justice Department indicted a criminal group for a gambling scam using hacked card s...

    Read More »
  • Microsoft Unveils Its First In-House AI Image Generator

    Microsoft Unveils Its First In-House AI Image Generator

    Microsoft has launched MAI-Image-1, its first internally developed text-to-image generator, enhancing its AI portfolio with a focus on high-quality, original visual content. The model was developed with artist input to avoid generic results and excels in producing realistic images quickly, partic...

    Read More »
  • Microsoft-OpenAI Deal Paves Way for Potential IPO

    Microsoft-OpenAI Deal Paves Way for Potential IPO

    The partnership between Microsoft and OpenAI is advancing with a new agreement, though structural and regulatory issues remain, including Microsoft's dual role as both collaborator and competitor. Microsoft has invested $13 billion in OpenAI and shares revenue from ChatGPT, while also increasing ...

    Read More »
  • Microsoft AI Chief Debunks Machine Consciousness as an 'Illusion'

    Microsoft AI Chief Debunks Machine Consciousness as an 'Illusion'

    Mustafa Suleyman co-founded DeepMind and now leads Microsoft's AI division, advocating for AI as a tool aligned with human needs rather than an independent entity. He warns against designing AI to simulate human consciousness, arguing it risks dangerous misunderstandings and distracts from creati...

    Read More »
  • 23 Must-Know AI Terms: Your Essential ChatGPT Glossary

    23 Must-Know AI Terms: Your Essential ChatGPT Glossary

    autonomous agents: An AI model that have the capabilities, programming and other tools to accomplish a specific task. large language model, or LLM: An AI model trained on mass amounts of text data to understand language and generate novel content in human-like language. multimodal AI: A type of AI that can process multiple types of inputs, including text, images, videos and speech. tokens: Small bits of written text that AI language models process to formulate their responses to your prompts. we...

    Read More »