Shuaib Mohammad smartexpert

llama.cpp `SOFT_MAX failed: invalid argument` on Blackwell consumer (sm_120) — root cause + 12-line fix

If you searched for this error, you're in the right place:

    ggml_cuda_compute_forward: SOFT_MAX failed
    CUDA error: invalid argument
      current device: 0, in function ggml_cuda_compute_forward at ggml/src/ggml-cuda/ggml-cuda.cu:2962
