[{"data":1,"prerenderedAt":2405},["ShallowReactive",2],{"blog-en-how-to-count-words-javascript":3},{"id":4,"title":5,"alt":6,"author":7,"body":8,"category":2374,"description":2375,"extension":2376,"faq":2377,"image":2392,"meta":2393,"navigation":1671,"path":2394,"publishedAt":2395,"seo":2396,"stem":2397,"tags":2398,"__hash__":2404},"blog\u002Fen\u002Fhow-to-count-words-javascript.md","How to Count Words in JavaScript (5 Methods Compared)","JavaScript word count comparison table: split vs regex vs Intl.Segmenter benchmark for Unicode and CJK languages","Vibe Apps Pro Team",{"type":9,"value":10,"toc":2341},"minimark",[11,19,26,29,34,37,102,105,107,111,119,196,199,264,270,272,280,352,367,444,462,467,469,477,549,554,591,652,657,659,667,729,749,843,846,851,853,860,1026,1042,1056,1158,1163,1200,1202,1206,1338,1343,1350,1352,1356,1359,1362,1379,1389,1556,1565,1567,1571,1632,1634,1638,1641,1836,1839,1841,1845,1856,1870,1884,1886,1890,1893,2175,2184,2186,2190,2194,2208,2215,2235,2242,2255,2259,2264,2268,2274,2278,2298,2302,2310,2314,2327,2329,2337],[12,13,14,18],"p",{},[15,16,17],"code",{},"text.split(' ').length",". Every junior JS dev has shipped this. It's wrong in at least three distinct ways.",[12,20,21,22,25],{},"This is a full breakdown of five approaches to word counting in JavaScript — what each one actually does under the hood, where it breaks, and which one you should use. Spoiler: it's ",[15,23,24],{},"Intl.Segmenter",".",[27,28],"hr",{},[30,31,33],"h2",{"id":32},"why-word-counting-is-harder-than-it-looks","Why Word Counting Is Harder Than It Looks",[12,35,36],{},"Text feels simple. Words are separated by spaces, right? 
Except:",[38,39,40,52,62,72,78,88],"ul",{},[41,42,43,47,48,51],"li",{},[44,45,46],"strong",{},"Double spaces"," between sentences (",[15,49,50],{},"hello  world"," → split gives 3 tokens, not 2)",[41,53,54,57,58,61],{},[44,55,56],{},"Non-breaking spaces"," (",[15,59,60],{}," ",") — invisible in the browser, pasted constantly from Word, PDFs, and Google Docs",[41,63,64,67,68,71],{},[44,65,66],{},"Tabs and newlines"," — valid whitespace that ",[15,69,70],{},"split(' ')"," ignores",[41,73,74,77],{},[44,75,76],{},"CJK text"," — Chinese, Japanese, Korean have no spaces between words at all",[41,79,80,83,84,87],{},[44,81,82],{},"Emoji"," — a family emoji (",[15,85,86],{},"👨‍👩‍👧‍👦",") is 1 visible character but 11 UTF-16 code units, 7 Unicode code points, and 1 grapheme cluster",[41,89,90,93,94,97,98,101],{},[44,91,92],{},"Contractions and hyphenation"," — ",[15,95,96],{},"don't",", ",[15,99,100],{},"state-of-the-art"," — should that be 1 word or 2?",[12,103,104],{},"Most counting bugs are invisible until a non-English user hits your app.",[27,106],{},[30,108,110],{"id":109},"the-5-methods","The 5 Methods",[112,113,115,116,118],"h3",{"id":114},"method-1-textsplit-length-the-naive-split","Method 1: ",[15,117,17],{}," — The Naive Split",[120,121,126],"pre",{"className":122,"code":123,"language":124,"meta":125,"style":125},"language-js shiki shiki-themes material-theme-lighter material-theme material-theme-palenight","function countWords(text) {\n  return text.split(' ').length;\n}\n","js","",[15,127,128,155,190],{"__ignoreMap":125},[129,130,133,137,141,145,149,152],"span",{"class":131,"line":132},"line",1,[129,134,136],{"class":135},"spNyl","function",[129,138,140],{"class":139},"s2Zo4"," countWords",[129,142,144],{"class":143},"sMK4o","(",[129,146,148],{"class":147},"sHdIc","text",[129,150,151],{"class":143},")",[129,153,154],{"class":143}," {\n",[129,156,158,162,166,168,171,174,177,180,182,184,187],{"class":131,"line":157},2,[129,159,161],{"class":160},"s7zQu","  
return",[129,163,165],{"class":164},"sTEyZ"," text",[129,167,25],{"class":143},[129,169,170],{"class":139},"split",[129,172,144],{"class":173},"swJcz",[129,175,176],{"class":143},"'",[129,178,179],{"class":143}," '",[129,181,151],{"class":173},[129,183,25],{"class":143},[129,185,186],{"class":164},"length",[129,188,189],{"class":143},";\n",[129,191,193],{"class":131,"line":192},3,[129,194,195],{"class":143},"}\n",[12,197,198],{},"This is the first thing people write. It's wrong immediately.",[120,200,202],{"className":122,"code":201,"language":124,"meta":125,"style":125},"countWords('hello  world')  \u002F\u002F → 3 (extra empty string token)\ncountWords('hello\\tworld')  \u002F\u002F → 1 (tab not counted as separator)\ncountWords('')              \u002F\u002F → 1 (empty string gives [''], not [])\n",[15,203,204,225,249],{"__ignoreMap":125},[129,205,206,209,211,213,216,218,221],{"class":131,"line":132},[129,207,208],{"class":139},"countWords",[129,210,144],{"class":164},[129,212,176],{"class":143},[129,214,50],{"class":215},"sfazB",[129,217,176],{"class":143},[129,219,220],{"class":164},")  ",[129,222,224],{"class":223},"sHwdD","\u002F\u002F → 3 (extra empty string token)\n",[129,226,227,229,231,233,236,239,242,244,246],{"class":131,"line":157},[129,228,208],{"class":139},[129,230,144],{"class":164},[129,232,176],{"class":143},[129,234,235],{"class":215},"hello",[129,237,238],{"class":164},"\\t",[129,240,241],{"class":215},"world",[129,243,176],{"class":143},[129,245,220],{"class":164},[129,247,248],{"class":223},"\u002F\u002F → 1 (tab not counted as separator)\n",[129,250,251,253,255,258,261],{"class":131,"line":192},[129,252,208],{"class":139},[129,254,144],{"class":164},[129,256,257],{"class":143},"''",[129,259,260],{"class":164},")              ",[129,262,263],{"class":223},"\u002F\u002F → 1 (empty string gives [''], not [])\n",[12,265,266,269],{},[44,267,268],{},"Verdict:"," Don't ship this. 
Ever.",[27,271],{},[112,273,275,276,279],{"id":274},"method-2-texttrimsplitsfilterbooleanlength-the-patched-split","Method 2: ",[15,277,278],{},"text.trim().split(\u002F\\s+\u002F).filter(Boolean).length"," — The Patched Split",[120,281,283],{"className":122,"code":282,"language":124,"meta":125,"style":125},"function countWords(text) {\n  return text.trim().split(\u002F\\s+\u002F).filter(Boolean).length;\n}\n",[15,284,285,299,348],{"__ignoreMap":125},[129,286,287,289,291,293,295,297],{"class":131,"line":132},[129,288,136],{"class":135},[129,290,140],{"class":139},[129,292,144],{"class":143},[129,294,148],{"class":147},[129,296,151],{"class":143},[129,298,154],{"class":143},[129,300,301,303,305,307,310,313,315,317,319,322,325,328,330,332,335,337,340,342,344,346],{"class":131,"line":157},[129,302,161],{"class":160},[129,304,165],{"class":164},[129,306,25],{"class":143},[129,308,309],{"class":139},"trim",[129,311,312],{"class":173},"()",[129,314,25],{"class":143},[129,316,170],{"class":139},[129,318,144],{"class":173},[129,320,321],{"class":143},"\u002F",[129,323,324],{"class":215},"\\s",[129,326,327],{"class":143},"+\u002F",[129,329,151],{"class":173},[129,331,25],{"class":143},[129,333,334],{"class":139},"filter",[129,336,144],{"class":173},[129,338,339],{"class":164},"Boolean",[129,341,151],{"class":173},[129,343,25],{"class":143},[129,345,186],{"class":164},[129,347,189],{"class":143},[129,349,350],{"class":131,"line":192},[129,351,195],{"class":143},[12,353,354,355,358,359,362,363,366],{},"Much better. ",[15,356,357],{},"\u002F\\s+\u002F"," matches any sequence of whitespace — spaces, tabs, newlines, carriage returns. ",[15,360,361],{},"trim()"," handles leading\u002Ftrailing whitespace. 
",[15,364,365],{},"filter(Boolean)"," drops empty strings.",[120,368,370],{"className":122,"code":369,"language":124,"meta":125,"style":125},"countWords('hello  world')     \u002F\u002F → 2 ✓\ncountWords('hello\\tworld')     \u002F\u002F → 2 ✓\ncountWords('')                 \u002F\u002F → 0 ✓\ncountWords('héllo wörld')      \u002F\u002F → 2 ✓ (accent characters preserved)\n",[15,371,372,390,410,424],{"__ignoreMap":125},[129,373,374,376,378,380,382,384,387],{"class":131,"line":132},[129,375,208],{"class":139},[129,377,144],{"class":164},[129,379,176],{"class":143},[129,381,50],{"class":215},[129,383,176],{"class":143},[129,385,386],{"class":164},")     ",[129,388,389],{"class":223},"\u002F\u002F → 2 ✓\n",[129,391,392,394,396,398,400,402,404,406,408],{"class":131,"line":157},[129,393,208],{"class":139},[129,395,144],{"class":164},[129,397,176],{"class":143},[129,399,235],{"class":215},[129,401,238],{"class":164},[129,403,241],{"class":215},[129,405,176],{"class":143},[129,407,386],{"class":164},[129,409,389],{"class":223},[129,411,412,414,416,418,421],{"class":131,"line":192},[129,413,208],{"class":139},[129,415,144],{"class":164},[129,417,257],{"class":143},[129,419,420],{"class":164},")                 ",[129,422,423],{"class":223},"\u002F\u002F → 0 ✓\n",[129,425,427,429,431,433,436,438,441],{"class":131,"line":426},4,[129,428,208],{"class":139},[129,430,144],{"class":164},[129,432,176],{"class":143},[129,434,435],{"class":215},"héllo wörld",[129,437,176],{"class":143},[129,439,440],{"class":164},")      ",[129,442,443],{"class":223},"\u002F\u002F → 2 ✓ (accent characters preserved)\n",[12,445,446,449,450,453,454,457,458,461],{},[44,447,448],{},"Where it breaks:"," CJK text. ",[15,451,452],{},"'你好世界'.trim().split(\u002F\\s+\u002F)"," returns ",[15,455,456],{},"['你好世界']"," — one token, not four words. 
Also counts punctuation-only tokens: if your input has ",[15,459,460],{},"-- --",", you get 2 phantom \"words.\"",[12,463,464,466],{},[44,465,268],{}," Fine for English-only tools. Broken for global audiences.",[27,468],{},[112,470,472,473,476],{"id":471},"method-3-textmatchbwbg-length-the-classic-regex","Method 3: ",[15,474,475],{},"(text.match(\u002F\\b\\w+\\b\u002Fg) || []).length"," — The Classic Regex",[120,478,480],{"className":122,"code":479,"language":124,"meta":125,"style":125},"function countWords(text) {\n  return (text.match(\u002F\\b\\w+\\b\u002Fg) || []).length;\n}\n",[15,481,482,496,545],{"__ignoreMap":125},[129,483,484,486,488,490,492,494],{"class":131,"line":132},[129,485,136],{"class":135},[129,487,140],{"class":139},[129,489,144],{"class":143},[129,491,148],{"class":147},[129,493,151],{"class":143},[129,495,154],{"class":143},[129,497,498,500,502,504,506,509,511,513,516,519,522,524,526,530,533,536,539,541,543],{"class":131,"line":157},[129,499,161],{"class":160},[129,501,57],{"class":173},[129,503,148],{"class":164},[129,505,25],{"class":143},[129,507,508],{"class":139},"match",[129,510,144],{"class":173},[129,512,321],{"class":143},[129,514,515],{"class":160},"\\b",[129,517,518],{"class":215},"\\w",[129,520,521],{"class":143},"+",[129,523,515],{"class":160},[129,525,321],{"class":143},[129,527,529],{"class":528},"sbssI","g",[129,531,532],{"class":173},") ",[129,534,535],{"class":143},"||",[129,537,538],{"class":173}," [])",[129,540,25],{"class":143},[129,542,186],{"class":164},[129,544,189],{"class":143},[129,546,547],{"class":131,"line":192},[129,548,195],{"class":143},[12,550,551,552,25],{},"You'll see this everywhere on Stack Overflow. The problem is ",[15,553,518],{},[12,555,556,557,559,560,563,564,567,568,571,572,575,576,579,580,583,584,453,587,590],{},"In JavaScript, ",[15,558,518],{}," is ",[15,561,562],{},"[A-Za-z0-9_]",". That's the full set. 
Every character outside ASCII — Cyrillic (",[15,565,566],{},"привет","), Arabic (",[15,569,570],{},"مرحبا","), Greek (",[15,573,574],{},"γεια","), Korean (",[15,577,578],{},"안녕",") — is invisible to this regex. The ",[15,581,582],{},"|| []"," fallback is the tell: without it, ",[15,585,586],{},".match()",[15,588,589],{},"null"," on no matches, which is what happens for text written entirely in a non-Latin script.",[120,592,594],{"className":122,"code":593,"language":124,"meta":125,"style":125},"countWords('hello world')   \u002F\u002F → 2 ✓\ncountWords('привет мир')    \u002F\u002F → 0 ✗ (Cyrillic not matched)\ncountWords('héllo')         \u002F\u002F → 2 ✗ ('é' is not in \\w, so the match splits into 'h' and 'llo')\n",[15,595,596,614,633],{"__ignoreMap":125},[129,597,598,600,602,604,607,609,612],{"class":131,"line":132},[129,599,208],{"class":139},[129,601,144],{"class":164},[129,603,176],{"class":143},[129,605,606],{"class":215},"hello world",[129,608,176],{"class":143},[129,610,611],{"class":164},")   ",[129,613,389],{"class":223},[129,615,616,618,620,622,625,627,630],{"class":131,"line":157},[129,617,208],{"class":139},[129,619,144],{"class":164},[129,621,176],{"class":143},[129,623,624],{"class":215},"привет мир",[129,626,176],{"class":143},[129,628,629],{"class":164},")    ",[129,631,632],{"class":223},"\u002F\u002F → 0 ✗ (Cyrillic not matched)\n",[129,634,635,637,639,641,644,646,649],{"class":131,"line":192},[129,636,208],{"class":139},[129,638,144],{"class":164},[129,640,176],{"class":143},[129,642,643],{"class":215},"héllo",[129,645,176],{"class":143},[129,647,648],{"class":164},")         ",[129,650,651],{"class":223},"\u002F\u002F → 2 ✗ ('é' is not in \\w, so the match splits into 'h' and 'llo')\n",[12,653,654,656],{},[44,655,268],{}," Acceptable for dev tools that only process ASCII. 
Silent fail for everything else.",[27,658],{},[112,660,662,663,666],{"id":661},"method-4-textmatchplgu-length-unicode-property-escapes","Method 4: ",[15,664,665],{},"(text.match(\u002F\\p{L}+\u002Fgu) || []).length"," — Unicode Property Escapes",[120,668,670],{"className":122,"code":669,"language":124,"meta":125,"style":125},"function countWords(text) {\n  return (text.match(\u002F\\p{L}+\u002Fgu) || []).length;\n}\n",[15,671,672,686,725],{"__ignoreMap":125},[129,673,674,676,678,680,682,684],{"class":131,"line":132},[129,675,136],{"class":135},[129,677,140],{"class":139},[129,679,144],{"class":143},[129,681,148],{"class":147},[129,683,151],{"class":143},[129,685,154],{"class":143},[129,687,688,690,692,694,696,698,700,702,705,708,710,713,715,717,719,721,723],{"class":131,"line":157},[129,689,161],{"class":160},[129,691,57],{"class":173},[129,693,148],{"class":164},[129,695,25],{"class":143},[129,697,508],{"class":139},[129,699,144],{"class":173},[129,701,321],{"class":143},[129,703,704],{"class":164},"\\p",[129,706,707],{"class":215},"{L}",[129,709,327],{"class":143},[129,711,712],{"class":528},"gu",[129,714,532],{"class":173},[129,716,535],{"class":143},[129,718,538],{"class":173},[129,720,25],{"class":143},[129,722,186],{"class":164},[129,724,189],{"class":143},[129,726,727],{"class":131,"line":192},[129,728,195],{"class":143},[12,730,731,734,735,738,739,742,743,745,746,748],{},[15,732,733],{},"\\p{L}"," is a Unicode Property Escape meaning \"any Unicode letter.\" The ",[15,736,737],{},"u"," flag is required. Without it you get no ",[15,740,741],{},"SyntaxError"," because ",[15,744,704],{}," is still legal in non-Unicode mode, where it is read as an escaped literal p, so the pattern silently matches the wrong text. 
The ",[15,747,529],{}," flag finds all matches globally.",[120,750,752],{"className":122,"code":751,"language":124,"meta":125,"style":125},"countWords('hello world')      \u002F\u002F → 2 ✓\ncountWords('привет мир')       \u002F\u002F → 2 ✓ (Cyrillic works)\ncountWords('héllo wörld')      \u002F\u002F → 2 ✓ (accented chars work)\ncountWords('你好 世界')         \u002F\u002F → 2 ✓ (space-separated CJK)\ncountWords('你好世界')          \u002F\u002F → 1 ✗ (no spaces, counts as one match)\n",[15,753,754,770,788,805,823],{"__ignoreMap":125},[129,755,756,758,760,762,764,766,768],{"class":131,"line":132},[129,757,208],{"class":139},[129,759,144],{"class":164},[129,761,176],{"class":143},[129,763,606],{"class":215},[129,765,176],{"class":143},[129,767,440],{"class":164},[129,769,389],{"class":223},[129,771,772,774,776,778,780,782,785],{"class":131,"line":157},[129,773,208],{"class":139},[129,775,144],{"class":164},[129,777,176],{"class":143},[129,779,624],{"class":215},[129,781,176],{"class":143},[129,783,784],{"class":164},")       ",[129,786,787],{"class":223},"\u002F\u002F → 2 ✓ (Cyrillic works)\n",[129,789,790,792,794,796,798,800,802],{"class":131,"line":192},[129,791,208],{"class":139},[129,793,144],{"class":164},[129,795,176],{"class":143},[129,797,435],{"class":215},[129,799,176],{"class":143},[129,801,440],{"class":164},[129,803,804],{"class":223},"\u002F\u002F → 2 ✓ (accented chars work)\n",[129,806,807,809,811,813,816,818,820],{"class":131,"line":426},[129,808,208],{"class":139},[129,810,144],{"class":164},[129,812,176],{"class":143},[129,814,815],{"class":215},"你好 世界",[129,817,176],{"class":143},[129,819,648],{"class":164},[129,821,822],{"class":223},"\u002F\u002F → 2 ✓ (space-separated CJK)\n",[129,824,826,828,830,832,835,837,840],{"class":131,"line":825},5,[129,827,208],{"class":139},[129,829,144],{"class":164},[129,831,176],{"class":143},[129,833,834],{"class":215},"你好世界",[129,836,176],{"class":143},[129,838,839],{"class":164},")          
",[129,841,842],{"class":223},"\u002F\u002F → 1 ✗ (no spaces, counts as one match)\n",[12,844,845],{},"Numbers and standalone punctuation are automatically excluded, which is usually what you want.",[12,847,848,850],{},[44,849,268],{}," Excellent for Latin, Cyrillic, Arabic, Greek, Hebrew, and space-separated CJK. Still can't segment CJK without whitespace.",[27,852],{},[112,854,856,857,859],{"id":855},"method-5-intlsegmenter-the-right-answer","Method 5: ",[15,858,24],{}," — The Right Answer",[120,861,863],{"className":122,"code":862,"language":124,"meta":125,"style":125},"function countWords(text) {\n  const segmenter = new Intl.Segmenter('und', { granularity: 'word' });\n  let count = 0;\n  for (const { isWordLike } of segmenter.segment(text)) {\n    if (isWordLike) count++;\n  }\n  return count;\n}\n",[15,864,865,879,936,951,988,1006,1012,1021],{"__ignoreMap":125},[129,866,867,869,871,873,875,877],{"class":131,"line":132},[129,868,136],{"class":135},[129,870,140],{"class":139},[129,872,144],{"class":143},[129,874,148],{"class":147},[129,876,151],{"class":143},[129,878,154],{"class":143},[129,880,881,884,887,890,893,896,898,901,903,905,908,910,913,916,919,922,924,927,929,932,934],{"class":131,"line":157},[129,882,883],{"class":135},"  const",[129,885,886],{"class":164}," segmenter",[129,888,889],{"class":143}," =",[129,891,892],{"class":143}," new",[129,894,895],{"class":164}," Intl",[129,897,25],{"class":143},[129,899,900],{"class":139},"Segmenter",[129,902,144],{"class":173},[129,904,176],{"class":143},[129,906,907],{"class":215},"und",[129,909,176],{"class":143},[129,911,912],{"class":143},",",[129,914,915],{"class":143}," {",[129,917,918],{"class":173}," granularity",[129,920,921],{"class":143},":",[129,923,179],{"class":143},[129,925,926],{"class":215},"word",[129,928,176],{"class":143},[129,930,931],{"class":143}," }",[129,933,151],{"class":173},[129,935,189],{"class":143},[129,937,938,941,944,946,949],{"class":131,"line":192},[129,939,940],{"class":135},"  
let",[129,942,943],{"class":164}," count",[129,945,889],{"class":143},[129,947,948],{"class":528}," 0",[129,950,189],{"class":143},[129,952,953,956,958,961,963,966,968,971,973,975,978,980,982,985],{"class":131,"line":426},[129,954,955],{"class":160},"  for",[129,957,57],{"class":173},[129,959,960],{"class":135},"const",[129,962,915],{"class":143},[129,964,965],{"class":164}," isWordLike",[129,967,931],{"class":143},[129,969,970],{"class":143}," of",[129,972,886],{"class":164},[129,974,25],{"class":143},[129,976,977],{"class":139},"segment",[129,979,144],{"class":173},[129,981,148],{"class":164},[129,983,984],{"class":173},")) ",[129,986,987],{"class":143},"{\n",[129,989,990,993,995,998,1000,1003],{"class":131,"line":825},[129,991,992],{"class":160},"    if",[129,994,57],{"class":173},[129,996,997],{"class":164},"isWordLike",[129,999,532],{"class":173},[129,1001,1002],{"class":164},"count",[129,1004,1005],{"class":143},"++;\n",[129,1007,1009],{"class":131,"line":1008},6,[129,1010,1011],{"class":143},"  }\n",[129,1013,1015,1017,1019],{"class":131,"line":1014},7,[129,1016,161],{"class":160},[129,1018,943],{"class":164},[129,1020,189],{"class":143},[129,1022,1024],{"class":131,"line":1023},8,[129,1025,195],{"class":143},[12,1027,1028,1030,1031,1034,1035,97,1038,1041],{},[15,1029,24],{}," is part of the ECMAScript Internationalization API (ECMA-402) and is available in all modern JavaScript runtimes (Baseline 2024). Pass ",[15,1032,1033],{},"'und'"," as the locale for locale-independent segmentation, or your specific locale (",[15,1036,1037],{},"'zh'",[15,1039,1040],{},"'ja'",") for language-aware rules.",[12,1043,1044,1045,1047,1048,1051,1052,1055],{},"The ",[15,1046,997],{}," flag is the key — it's ",[15,1049,1050],{},"true"," for actual words and ",[15,1053,1054],{},"false"," for spaces, punctuation, and separators. 
No filtering needed.",[120,1057,1059],{"className":122,"code":1058,"language":124,"meta":125,"style":125},"countWords('hello world')      \u002F\u002F → 2 ✓\ncountWords('привет мир')       \u002F\u002F → 2 ✓\ncountWords('你好世界')          \u002F\u002F → 2 ✓ (你好 = hello, 世界 = world — dictionary segmentation)\ncountWords(\"don't stop\")       \u002F\u002F → 2 ✓ (contraction = 1 word)\ncountWords('state-of-the-art') \u002F\u002F → 4 ✓ (hyphenated = 4 words, matches editorial convention)\ncountWords('')                 \u002F\u002F → 0 ✓\n",[15,1060,1061,1077,1093,1110,1129,1146],{"__ignoreMap":125},[129,1062,1063,1065,1067,1069,1071,1073,1075],{"class":131,"line":132},[129,1064,208],{"class":139},[129,1066,144],{"class":164},[129,1068,176],{"class":143},[129,1070,606],{"class":215},[129,1072,176],{"class":143},[129,1074,440],{"class":164},[129,1076,389],{"class":223},[129,1078,1079,1081,1083,1085,1087,1089,1091],{"class":131,"line":157},[129,1080,208],{"class":139},[129,1082,144],{"class":164},[129,1084,176],{"class":143},[129,1086,624],{"class":215},[129,1088,176],{"class":143},[129,1090,784],{"class":164},[129,1092,389],{"class":223},[129,1094,1095,1097,1099,1101,1103,1105,1107],{"class":131,"line":192},[129,1096,208],{"class":139},[129,1098,144],{"class":164},[129,1100,176],{"class":143},[129,1102,834],{"class":215},[129,1104,176],{"class":143},[129,1106,839],{"class":164},[129,1108,1109],{"class":223},"\u002F\u002F → 2 ✓ (你好 = hello, 世界 = world — dictionary segmentation)\n",[129,1111,1112,1114,1116,1119,1122,1124,1126],{"class":131,"line":426},[129,1113,208],{"class":139},[129,1115,144],{"class":164},[129,1117,1118],{"class":143},"\"",[129,1120,1121],{"class":215},"don't stop",[129,1123,1118],{"class":143},[129,1125,784],{"class":164},[129,1127,1128],{"class":223},"\u002F\u002F → 2 ✓ (contraction = 1 
word)\n",[129,1130,1131,1133,1135,1137,1139,1141,1143],{"class":131,"line":825},[129,1132,208],{"class":139},[129,1134,144],{"class":164},[129,1136,176],{"class":143},[129,1138,100],{"class":215},[129,1140,176],{"class":143},[129,1142,532],{"class":164},[129,1144,1145],{"class":223},"\u002F\u002F → 4 ✓ (hyphenated = 4 words, matches editorial convention)\n",[129,1147,1148,1150,1152,1154,1156],{"class":131,"line":1008},[129,1149,208],{"class":139},[129,1151,144],{"class":164},[129,1153,257],{"class":143},[129,1155,420],{"class":164},[129,1157,423],{"class":223},[12,1159,1160,1162],{},[44,1161,268],{}," Use this. It's what browsers use internally for spell-check and text selection.",[1164,1165,1166],"blockquote",{},[12,1167,1168,1171,1172,1175,1176,1178,1179,1182,1183,1186,1187,1195,1196,1199],{},[44,1169,1170],{},"Node.js note:"," On Node.js 16+ with the default ",[15,1173,1174],{},"full-icu"," build, ",[15,1177,24],{}," works out of the box. If you're on an older version or a ",[15,1180,1181],{},"small-icu"," build (common in some Docker images), you may get ",[15,1184,1185],{},"TypeError: Intl.Segmenter is not a constructor",". 
Fix it by installing the ",[1188,1189,1193],"a",{"href":1190,"rel":1191},"https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Ffull-icu",[1192],"nofollow",[15,1194,1174],{}," package and passing ",[15,1197,1198],{},"--icu-data-dir"," at startup — or just upgrade to Node 18+, where full ICU data is bundled by default.",[27,1201],{},[30,1203,1205],{"id":1204},"accuracy-comparison-table","Accuracy Comparison Table",[1207,1208,1209,1237],"table",{},[1210,1211,1212],"thead",{},[1213,1214,1215,1219,1222,1225,1228,1231,1234],"tr",{},[1216,1217,1218],"th",{},"Method",[1216,1220,1221],{},"English",[1216,1223,1224],{},"Accents",[1216,1226,1227],{},"Cyrillic\u002FArabic",[1216,1229,1230],{},"CJK (no spaces)",[1216,1232,1233],{},"Empty string",[1216,1235,1236],{},"Contraction",[1238,1239,1240,1262,1282,1301,1320],"tbody",{},[1213,1241,1242,1247,1250,1253,1255,1257,1260],{},[1243,1244,1245],"td",{},[15,1246,70],{},[1243,1248,1249],{},"✗ (double spaces)",[1243,1251,1252],{},"✓",[1243,1254,1252],{},[1243,1256,1277],{},[1243,1258,1259],{},"✗ (returns 
1)",[1243,1261,1252],{},[1213,1263,1264,1269,1271,1273,1275,1278,1280],{},[1243,1265,1266],{},[15,1267,1268],{},"trim().split(\u002F\\s+\u002F)",[1243,1270,1252],{},[1243,1272,1252],{},[1243,1274,1252],{},[1243,1276,1277],{},"✗",[1243,1279,1252],{},[1243,1281,1252],{},[1213,1283,1284,1289,1291,1293,1295,1297,1299],{},[1243,1285,1286],{},[15,1287,1288],{},"\u002F\\b\\w+\\b\u002Fg",[1243,1290,1252],{},[1243,1292,1277],{},[1243,1294,1277],{},[1243,1296,1277],{},[1243,1298,1252],{},[1243,1300,1252],{},[1213,1302,1303,1308,1310,1312,1314,1316,1318],{},[1243,1304,1305],{},[15,1306,1307],{},"\u002F\\p{L}+\u002Fgu",[1243,1309,1252],{},[1243,1311,1252],{},[1243,1313,1252],{},[1243,1315,1277],{},[1243,1317,1252],{},[1243,1319,1252],{},[1213,1321,1322,1326,1328,1330,1332,1334,1336],{},[1243,1323,1324],{},[15,1325,24],{},[1243,1327,1252],{},[1243,1329,1252],{},[1243,1331,1252],{},[1243,1333,1252],{},[1243,1335,1252],{},[1243,1337,1252],{},[12,1339,1044,1340,1342],{},[15,1341,24],{}," row is the only one with all checks.",[12,1344,1345],{},[1346,1347],"img",{"alt":1348,"src":1349},"The same sentence rendered in Latin, Cyrillic, Arabic, CJK, and emoji scripts — green checkmark over Intl.Segmenter, red X over \\w regex","\u002Farticles\u002Fhow-to-count-words-javascript\u002Fsection-unicode.webp",[27,1351],{},[30,1353,1355],{"id":1354},"performance-considerations","Performance Considerations",[12,1357,1358],{},"For a 1,000-word document, all five methods are negligible — under 1ms on any modern machine. 
The difference shows at scale.",[12,1360,1361],{},"At 100,000 words (a full novel manuscript):",[38,1363,1364,1374],{},[41,1365,1366,1367,1369,1370,1373],{},"Regex methods (",[15,1368,1307],{},") run in ~20–40ms — fast enough for real-time counting on ",[15,1371,1372],{},"input"," events",[41,1375,1376,1378],{},[15,1377,24],{}," runs in ~80–120ms, right around the 100ms mark where typing lag becomes noticeable",[12,1380,1381,1384,1385,1388],{},[44,1382,1383],{},"Rule of thumb:"," For inputs above 50,000 words, run the counter in a Web Worker. Pass the text via ",[15,1386,1387],{},"postMessage",", run the segmenter in the worker context, and post the result back. The main thread stays unblocked.",[120,1390,1392],{"className":122,"code":1391,"language":124,"meta":125,"style":125},"\u002F\u002F word-count.worker.js\nself.onmessage = ({ data: text }) => {\n  const segmenter = new Intl.Segmenter('und', { granularity: 'word' });\n  let count = 0;\n  for (const { isWordLike } of segmenter.segment(text)) {\n    if (isWordLike) count++;\n  }\n  self.postMessage(count);\n};\n",[15,1393,1394,1399,1429,1473,1485,1515,1529,1533,1550],{"__ignoreMap":125},[129,1395,1396],{"class":131,"line":132},[129,1397,1398],{"class":223},"\u002F\u002F word-count.worker.js\n",[129,1400,1401,1404,1406,1409,1411,1414,1417,1419,1421,1424,1427],{"class":131,"line":157},[129,1402,1403],{"class":164},"self",[129,1405,25],{"class":143},[129,1407,1408],{"class":139},"onmessage",[129,1410,889],{"class":143},[129,1412,1413],{"class":143}," ({",[129,1415,1416],{"class":173}," data",[129,1418,921],{"class":143},[129,1420,165],{"class":147},[129,1422,1423],{"class":143}," })",[129,1425,1426],{"class":135}," 
=>",[129,1428,154],{"class":143},[129,1430,1431,1433,1435,1437,1439,1441,1443,1445,1447,1449,1451,1453,1455,1457,1459,1461,1463,1465,1467,1469,1471],{"class":131,"line":192},[129,1432,883],{"class":135},[129,1434,886],{"class":164},[129,1436,889],{"class":143},[129,1438,892],{"class":143},[129,1440,895],{"class":164},[129,1442,25],{"class":143},[129,1444,900],{"class":139},[129,1446,144],{"class":173},[129,1448,176],{"class":143},[129,1450,907],{"class":215},[129,1452,176],{"class":143},[129,1454,912],{"class":143},[129,1456,915],{"class":143},[129,1458,918],{"class":173},[129,1460,921],{"class":143},[129,1462,179],{"class":143},[129,1464,926],{"class":215},[129,1466,176],{"class":143},[129,1468,931],{"class":143},[129,1470,151],{"class":173},[129,1472,189],{"class":143},[129,1474,1475,1477,1479,1481,1483],{"class":131,"line":426},[129,1476,940],{"class":135},[129,1478,943],{"class":164},[129,1480,889],{"class":143},[129,1482,948],{"class":528},[129,1484,189],{"class":143},[129,1486,1487,1489,1491,1493,1495,1497,1499,1501,1503,1505,1507,1509,1511,1513],{"class":131,"line":825},[129,1488,955],{"class":160},[129,1490,57],{"class":173},[129,1492,960],{"class":135},[129,1494,915],{"class":143},[129,1496,965],{"class":164},[129,1498,931],{"class":143},[129,1500,970],{"class":143},[129,1502,886],{"class":164},[129,1504,25],{"class":143},[129,1506,977],{"class":139},[129,1508,144],{"class":173},[129,1510,148],{"class":164},[129,1512,984],{"class":173},[129,1514,987],{"class":143},[129,1516,1517,1519,1521,1523,1525,1527],{"class":131,"line":1008},[129,1518,992],{"class":160},[129,1520,57],{"class":173},[129,1522,997],{"class":164},[129,1524,532],{"class":173},[129,1526,1002],{"class":164},[129,1528,1005],{"class":143},[129,1530,1531],{"class":131,"line":1014},[129,1532,1011],{"class":143},[129,1534,1535,1538,1540,1542,1544,1546,1548],{"class":131,"line":1023},[129,1536,1537],{"class":164},"  
self",[129,1539,25],{"class":143},[129,1541,1387],{"class":139},[129,1543,144],{"class":173},[129,1545,1002],{"class":164},[129,1547,151],{"class":173},[129,1549,189],{"class":143},[129,1551,1553],{"class":131,"line":1552},9,[129,1554,1555],{"class":143},"};\n",[12,1557,1558,1559,1564],{},"If you want to verify your implementation against a reference implementation, paste your text into our ",[44,1560,1561],{},[1188,1562,1563],{"href":321},"Word Counter"," — runs 100% in your browser, zero data sent to any server — and compare the count you get from your function against what it reports.",[27,1566],{},[30,1568,1570],{"id":1569},"when-to-use-each-method","When to Use Each Method",[1207,1572,1573,1583],{},[1210,1574,1575],{},[1213,1576,1577,1580],{},[1216,1578,1579],{},"Use case",[1216,1581,1582],{},"Recommended method",[1238,1584,1585,1594,1603,1612,1622],{},[1213,1586,1587,1590],{},[1243,1588,1589],{},"Quick English-only script",[1243,1591,1592],{},[15,1593,1268],{},[1213,1595,1596,1599],{},[1243,1597,1598],{},"Production app, multi-language",[1243,1600,1601],{},[15,1602,1307],{},[1213,1604,1605,1608],{},[1243,1606,1607],{},"Production app + CJK support",[1243,1609,1610],{},[15,1611,24],{},[1213,1613,1614,1617],{},[1243,1615,1616],{},"Node.js CLI, any language",[1243,1618,1619,1621],{},[15,1620,24],{}," (Node ≥16)",[1213,1623,1624,1627],{},[1243,1625,1626],{},"Legacy browsers (IE, old Safari)",[1243,1628,1629,1631],{},[15,1630,1268],{}," + polyfill note",[27,1633],{},[30,1635,1637],{"id":1636},"real-world-edge-cases-to-test","Real-World Edge Cases to Test",[12,1639,1640],{},"Before shipping a word counter, run it against these inputs. If any of them produce unexpected results, your method has a bug:",[120,1642,1644],{"className":122,"code":1643,"language":124,"meta":125,"style":125},"\u002F\u002F 1. Multiple whitespace types\n\"hello\\t\\nworld\"         \u002F\u002F expect: 2\n\n\u002F\u002F 2. 
Non-breaking space (pasted from Word)\n\"hello world\"       \u002F\u002F expect: 2\n\n\u002F\u002F 3. Zero-width space (pasted from web)\n\"hello​world\"       \u002F\u002F expect: 1 or 2 (debatable, document your choice)\n\n\u002F\u002F 4. Pure punctuation\n\"... --- ???\"            \u002F\u002F expect: 0\n\n\u002F\u002F 5. Numbers only\n\"123 456\"                \u002F\u002F expect: 0 (if counting \"words\" = letters only)\n                         \u002F\u002F expect: 2 (if counting tokens)\n\n\u002F\u002F 6. Mixed script\n\"hello мир\"              \u002F\u002F expect: 2\n\n\u002F\u002F 7. Emoji in text\n\"Great job 🎉\"           \u002F\u002F expect: 2 (emoji is not a word)\n\n\u002F\u002F 8. Hyphenated compound\n\"state-of-the-art design\" \u002F\u002F Intl.Segmenter → 5; \u002F\\p{L}+\u002Fgu → 5; split → 2\n",[15,1645,1646,1651,1667,1673,1678,1690,1694,1699,1711,1715,1721,1734,1739,1745,1758,1764,1769,1775,1788,1793,1799,1812,1817,1823],{"__ignoreMap":125},[129,1647,1648],{"class":131,"line":132},[129,1649,1650],{"class":223},"\u002F\u002F 1. Multiple whitespace types\n",[129,1652,1653,1655,1657,1660,1662,1664],{"class":131,"line":157},[129,1654,1118],{"class":143},[129,1656,235],{"class":215},[129,1658,1659],{"class":164},"\\t\\n",[129,1661,241],{"class":215},[129,1663,1118],{"class":143},[129,1665,1666],{"class":223},"         \u002F\u002F expect: 2\n",[129,1668,1669],{"class":131,"line":192},[129,1670,1672],{"emptyLinePlaceholder":1671},true,"\n",[129,1674,1675],{"class":131,"line":426},[129,1676,1677],{"class":223},"\u002F\u002F 2. 
Non-breaking space (pasted from Word)\n",[129,1679,1680,1682,1685,1687],{"class":131,"line":825},[129,1681,1118],{"class":143},[129,1683,1684],{"class":215},"hello world",[129,1686,1118],{"class":143},[129,1688,1689],{"class":223},"       \u002F\u002F expect: 2\n",[129,1691,1692],{"class":131,"line":1008},[129,1693,1672],{"emptyLinePlaceholder":1671},[129,1695,1696],{"class":131,"line":1014},[129,1697,1698],{"class":223},"\u002F\u002F 3. Zero-width space (pasted from web)\n",[129,1700,1701,1703,1706,1708],{"class":131,"line":1023},[129,1702,1118],{"class":143},[129,1704,1705],{"class":215},"hello​world",[129,1707,1118],{"class":143},[129,1709,1710],{"class":223},"       \u002F\u002F expect: 1 or 2 (debatable, document your choice)\n",[129,1712,1713],{"class":131,"line":1552},[129,1714,1672],{"emptyLinePlaceholder":1671},[129,1716,1718],{"class":131,"line":1717},10,[129,1719,1720],{"class":223},"\u002F\u002F 4. Pure punctuation\n",[129,1722,1724,1726,1729,1731],{"class":131,"line":1723},11,[129,1725,1118],{"class":143},[129,1727,1728],{"class":215},"... --- ???",[129,1730,1118],{"class":143},[129,1732,1733],{"class":223},"            \u002F\u002F expect: 0\n",[129,1735,1737],{"class":131,"line":1736},12,[129,1738,1672],{"emptyLinePlaceholder":1671},[129,1740,1742],{"class":131,"line":1741},13,[129,1743,1744],{"class":223},"\u002F\u002F 5. Numbers only\n",[129,1746,1748,1750,1753,1755],{"class":131,"line":1747},14,[129,1749,1118],{"class":143},[129,1751,1752],{"class":215},"123 456",[129,1754,1118],{"class":143},[129,1756,1757],{"class":223},"                \u002F\u002F expect: 0 (if counting \"words\" = letters only)\n",[129,1759,1761],{"class":131,"line":1760},15,[129,1762,1763],{"class":223},"                         \u002F\u002F expect: 2 (if counting tokens)\n",[129,1765,1767],{"class":131,"line":1766},16,[129,1768,1672],{"emptyLinePlaceholder":1671},[129,1770,1772],{"class":131,"line":1771},17,[129,1773,1774],{"class":223},"\u002F\u002F 6. 
Mixed script\n",[129,1776,1778,1780,1783,1785],{"class":131,"line":1777},18,[129,1779,1118],{"class":143},[129,1781,1782],{"class":215},"hello мир",[129,1784,1118],{"class":143},[129,1786,1787],{"class":223},"              \u002F\u002F expect: 2\n",[129,1789,1791],{"class":131,"line":1790},19,[129,1792,1672],{"emptyLinePlaceholder":1671},[129,1794,1796],{"class":131,"line":1795},20,[129,1797,1798],{"class":223},"\u002F\u002F 7. Emoji in text\n",[129,1800,1802,1804,1807,1809],{"class":131,"line":1801},21,[129,1803,1118],{"class":143},[129,1805,1806],{"class":215},"Great job 🎉",[129,1808,1118],{"class":143},[129,1810,1811],{"class":223},"           \u002F\u002F expect: 2 (emoji is not a word)\n",[129,1813,1815],{"class":131,"line":1814},22,[129,1816,1672],{"emptyLinePlaceholder":1671},[129,1818,1820],{"class":131,"line":1819},23,[129,1821,1822],{"class":223},"\u002F\u002F 8. Hyphenated compound\n",[129,1824,1826,1828,1831,1833],{"class":131,"line":1825},24,[129,1827,1118],{"class":143},[129,1829,1830],{"class":215},"state-of-the-art design",[129,1832,1118],{"class":143},[129,1834,1835],{"class":223}," \u002F\u002F Intl.Segmenter → 5; \u002F\\p{L}+\u002Fgu → 5; split → 2\n",[12,1837,1838],{},"The hyphenated case is the one that trips people up the most. There's no universal \"right\" answer — English style guides disagree. Pick a behavior, document it, and be consistent.",[27,1840],{},[30,1842,1844],{"id":1843},"what-the-word-counter-on-this-site-uses","What the Word Counter on This Site Uses",[12,1846,1044,1847,1849,1850,1852,1853,1855],{},[1188,1848,1563],{"href":321}," on editlyapp.com uses ",[15,1851,24],{}," with ",[15,1854,1307],{}," as a fallback for environments where the API isn't yet available. 
The segmenter runs on the main thread for documents under 50,000 words and kicks to a Web Worker for larger inputs — keeping UI response under 16ms regardless of manuscript length.",[12,1857,1858,1859,1863,1864,1852,1866,1869],{},"This is the same approach described in the ",[1188,1860,1862],{"href":1861},"\u002Fblog\u002Freadability-score-explained","Readability Score Explained"," article, where sentence segmentation uses ",[15,1865,24],{},[15,1867,1868],{},"granularity: 'sentence'"," to feed the Flesch-Kincaid formula accurately.",[12,1871,1872,1873,1877,1878,1880,1881,1883],{},"If you're building a regex-based text tool and need to test patterns against real content, the ",[1188,1874,1876],{"href":1875},"\u002Ffind-replace","Find & Replace"," tool on this site supports full regex with the ",[15,1879,737],{}," flag — useful for validating your ",[15,1882,1307],{}," patterns before wiring them into your app.",[27,1885],{},[30,1887,1889],{"id":1888},"the-definitive-implementation","The Definitive Implementation",[12,1891,1892],{},"Here's the production-ready version that handles every case above:",[120,1894,1896],{"className":122,"code":1895,"language":124,"meta":125,"style":125},"\u002F**\n * Count words in any language using Intl.Segmenter.\n * Falls back to Unicode regex for environments without Segmenter support.\n *\u002F\nfunction countWords(text) {\n  if (!text || !text.trim()) return 0;\n\n  if (typeof Intl !== 'undefined' && Intl.Segmenter) {\n    const segmenter = new Intl.Segmenter('und', { granularity: 'word' });\n    let count = 0;\n    for (const { isWordLike } of segmenter.segment(text)) {\n      if (isWordLike) count++;\n    }\n    return count;\n  }\n\n  \u002F\u002F Fallback: Unicode Property Escapes (all modern browsers, no IE)\n  return (text.match(\u002F\\p{L}+\u002Fgu) || 
[]).length;\n}\n",[15,1897,1898,1903,1908,1913,1918,1932,1966,1970,2004,2049,2062,2093,2108,2113,2122,2126,2130,2135,2171],{"__ignoreMap":125},[129,1899,1900],{"class":131,"line":132},[129,1901,1902],{"class":223},"\u002F**\n",[129,1904,1905],{"class":131,"line":157},[129,1906,1907],{"class":223}," * Count words in any language using Intl.Segmenter.\n",[129,1909,1910],{"class":131,"line":192},[129,1911,1912],{"class":223}," * Falls back to Unicode regex for environments without Segmenter support.\n",[129,1914,1915],{"class":131,"line":426},[129,1916,1917],{"class":223}," *\u002F\n",[129,1919,1920,1922,1924,1926,1928,1930],{"class":131,"line":825},[129,1921,136],{"class":135},[129,1923,140],{"class":139},[129,1925,144],{"class":143},[129,1927,148],{"class":147},[129,1929,151],{"class":143},[129,1931,154],{"class":143},[129,1933,1934,1937,1939,1942,1944,1947,1950,1952,1954,1956,1959,1962,1964],{"class":131,"line":1008},[129,1935,1936],{"class":160},"  if",[129,1938,57],{"class":173},[129,1940,1941],{"class":143},"!",[129,1943,148],{"class":164},[129,1945,1946],{"class":143}," ||",[129,1948,1949],{"class":143}," !",[129,1951,148],{"class":164},[129,1953,25],{"class":143},[129,1955,309],{"class":139},[129,1957,1958],{"class":173},"()) ",[129,1960,1961],{"class":160},"return",[129,1963,948],{"class":528},[129,1965,189],{"class":143},[129,1967,1968],{"class":131,"line":1014},[129,1969,1672],{"emptyLinePlaceholder":1671},[129,1971,1972,1974,1976,1979,1981,1984,1986,1989,1991,1994,1996,1998,2000,2002],{"class":131,"line":1023},[129,1973,1936],{"class":160},[129,1975,57],{"class":173},[129,1977,1978],{"class":143},"typeof",[129,1980,895],{"class":164},[129,1982,1983],{"class":143}," !==",[129,1985,179],{"class":143},[129,1987,1988],{"class":215},"undefined",[129,1990,176],{"class":143},[129,1992,1993],{"class":143}," 
&&",[129,1995,895],{"class":164},[129,1997,25],{"class":143},[129,1999,900],{"class":164},[129,2001,532],{"class":173},[129,2003,987],{"class":143},[129,2005,2006,2009,2011,2013,2015,2017,2019,2021,2023,2025,2027,2029,2031,2033,2035,2037,2039,2041,2043,2045,2047],{"class":131,"line":1552},[129,2007,2008],{"class":135},"    const",[129,2010,886],{"class":164},[129,2012,889],{"class":143},[129,2014,892],{"class":143},[129,2016,895],{"class":164},[129,2018,25],{"class":143},[129,2020,900],{"class":139},[129,2022,144],{"class":173},[129,2024,176],{"class":143},[129,2026,907],{"class":215},[129,2028,176],{"class":143},[129,2030,912],{"class":143},[129,2032,915],{"class":143},[129,2034,918],{"class":173},[129,2036,921],{"class":143},[129,2038,179],{"class":143},[129,2040,926],{"class":215},[129,2042,176],{"class":143},[129,2044,931],{"class":143},[129,2046,151],{"class":173},[129,2048,189],{"class":143},[129,2050,2051,2054,2056,2058,2060],{"class":131,"line":1717},[129,2052,2053],{"class":135},"    let",[129,2055,943],{"class":164},[129,2057,889],{"class":143},[129,2059,948],{"class":528},[129,2061,189],{"class":143},[129,2063,2064,2067,2069,2071,2073,2075,2077,2079,2081,2083,2085,2087,2089,2091],{"class":131,"line":1723},[129,2065,2066],{"class":160},"    for",[129,2068,57],{"class":173},[129,2070,960],{"class":135},[129,2072,915],{"class":143},[129,2074,965],{"class":164},[129,2076,931],{"class":143},[129,2078,970],{"class":143},[129,2080,886],{"class":164},[129,2082,25],{"class":143},[129,2084,977],{"class":139},[129,2086,144],{"class":173},[129,2088,148],{"class":164},[129,2090,984],{"class":173},[129,2092,987],{"class":143},[129,2094,2095,2098,2100,2102,2104,2106],{"class":131,"line":1736},[129,2096,2097],{"class":160},"      if",[129,2099,57],{"class":173},[129,2101,997],{"class":164},[129,2103,532],{"class":173},[129,2105,1002],{"class":164},[129,2107,1005],{"class":143},[129,2109,2110],{"class":131,"line":1741},[129,2111,2112],{"class":143},"    
}\n",[129,2114,2115,2118,2120],{"class":131,"line":1747},[129,2116,2117],{"class":160},"    return",[129,2119,943],{"class":164},[129,2121,189],{"class":143},[129,2123,2124],{"class":131,"line":1760},[129,2125,1011],{"class":143},[129,2127,2128],{"class":131,"line":1766},[129,2129,1672],{"emptyLinePlaceholder":1671},[129,2131,2132],{"class":131,"line":1771},[129,2133,2134],{"class":223},"  \u002F\u002F Fallback: Unicode Property Escapes (all modern browsers, no IE)\n",[129,2136,2137,2139,2141,2143,2145,2147,2149,2151,2153,2155,2157,2159,2161,2163,2165,2167,2169],{"class":131,"line":1777},[129,2138,161],{"class":160},[129,2140,57],{"class":173},[129,2142,148],{"class":164},[129,2144,25],{"class":143},[129,2146,508],{"class":139},[129,2148,144],{"class":173},[129,2150,321],{"class":143},[129,2152,704],{"class":164},[129,2154,707],{"class":215},[129,2156,327],{"class":143},[129,2158,712],{"class":528},[129,2160,532],{"class":173},[129,2162,535],{"class":143},[129,2164,538],{"class":173},[129,2166,25],{"class":143},[129,2168,186],{"class":164},[129,2170,189],{"class":143},[129,2172,2173],{"class":131,"line":1790},[129,2174,195],{"class":143},[12,2176,2177,2178,2180,2181,2183],{},"Two things to notice. First, the early return on empty\u002Fwhitespace-only input — ",[15,2179,24],{}," on an empty string returns zero segments, but the guard is explicit. Second, the feature check for ",[15,2182,24],{}," instead of a try\u002Fcatch — cleaner and cheaper.",[27,2185],{},[30,2187,2189],{"id":2188},"faq","FAQ",[112,2191,2193],{"id":2192},"whats-the-most-accurate-way-to-count-words-in-javascript","What's the most accurate way to count words in JavaScript?",[12,2195,2196,1852,2198,2201,2202,2204,2205,2207],{},[15,2197,24],{},[15,2199,2200],{},"granularity: 'word'"," is the most accurate. It's part of the ECMAScript Internationalization API (ECMA-402), built into every major JavaScript engine, and handles CJK and Thai (which have no whitespace between words), Arabic, emoji clusters, hyphenated words, and contractions correctly.
For most English-only cases, ",[15,2203,1307],{}," with the ",[15,2206,737],{}," flag is a solid, simpler alternative.",[112,2209,2211,2212,2214],{"id":2210},"why-does-textsplit-length-return-the-wrong-count","Why does ",[15,2213,17],{}," return the wrong count?",[12,2216,2217,2218,453,2221,2224,2225,2227,2228,2231,2232,2234],{},"Three reasons. First, it counts empty strings when text has consecutive spaces — ",[15,2219,2220],{},"'hello  world'.split(' ')",[15,2222,2223],{},"['hello', '', 'world']",", length 3, not 2. Second, it misses tabs (",[15,2226,238],{},"), newlines (",[15,2229,2230],{},"\\n","), and non-breaking spaces (U+00A0). Third, it counts leading\u002Ftrailing whitespace as phantom words unless you ",[15,2233,361],{}," first.",[112,2236,2238,2239,2241],{"id":2237},"does-bwbg-work-for-non-english-text","Does ",[15,2240,1288],{}," work for non-English text?",[12,2243,2244,2245,559,2247,2249,2250,2204,2252,2254],{},"No. ",[15,2246,518],{},[15,2248,562],{},". Every Cyrillic, Arabic, Greek, Hebrew, Korean, Chinese, and Japanese character returns zero matches. If you're building anything that handles non-English text, this regex silently produces wrong counts. Use ",[15,2251,1307],{},[15,2253,737],{}," flag instead.",[112,2256,2258],{"id":2257},"what-is-intlsegmenter-and-is-it-safe-to-use-in-production","What is Intl.Segmenter and is it safe to use in production?",[12,2260,2261,2263],{},[15,2262,24],{}," is part of the ECMAScript Internationalization API (ECMA-402), built into V8 (Chrome\u002FNode.js), SpiderMonkey (Firefox), and JavaScriptCore (Safari). It became Baseline in 2024, when Firefox 125 added support, so all three major browser engines now ship it. For Node.js, it's available from v16.0.0 onwards.
You can use it without a polyfill in any modern environment.",[112,2265,2267],{"id":2266},"how-should-i-count-words-in-a-nuxt-or-react-app-with-large-texts","How should I count words in a Nuxt or React app with large texts?",[12,2269,2270,2271,2273],{},"For texts under ~50,000 words, any method runs fast enough on the main thread. For larger inputs — manuscripts, pasted books, bulk processing — offload to a Web Worker so you don't freeze the UI. Pass the raw string via ",[15,2272,1387],{}," and run the segmenter in the worker context.",[112,2275,2277],{"id":2276},"do-contractions-count-as-one-word-or-two","Do contractions count as one word or two?",[12,2279,2280,2281,97,2283,97,2286,2289,2290,1852,2292,2294,2295,2297],{},"In English prose, contractions (",[15,2282,96],{},[15,2284,2285],{},"it's",[15,2287,2288],{},"you're",") should count as one word — that matches how editors, teachers, and publishers count. ",[15,2291,24],{},[15,2293,2200],{}," correctly returns ",[15,2296,96],{}," as a single word segment.",[112,2299,2301],{"id":2300},"why-doesnt-my-counter-match-google-docs-or-microsoft-word","Why doesn't my counter match Google Docs or Microsoft Word?",[12,2303,2304,2305,1852,2307,2309],{},"Google Docs and Microsoft Word use proprietary tokenization algorithms that aren't publicly documented. Google Docs typically excludes footnotes and counts hyphenated compounds as a single word. Word includes footnotes by default and may split hyphenated words differently depending on the language pack installed. 
",[15,2306,24],{},[15,2308,997],{}," gives the closest approximation to industry-standard editorial counting — and unlike either platform, its behavior is fully predictable because the W3C spec is public.",[112,2311,2313],{"id":2312},"how-do-i-count-words-without-counting-numbers-or-punctuation-only-tokens","How do I count words without counting numbers or punctuation-only tokens?",[12,2315,2316,2317,2319,2320,2323,2324,2326],{},"With ",[15,2318,24],{},", check ",[15,2321,2322],{},"segment.isWordLike === true"," — the API marks punctuation and spaces as non-word segments automatically. With ",[15,2325,1307],{},", only letter sequences match by definition, so numbers and standalone punctuation are excluded.",[27,2328],{},[12,2330,2331,2332,2336],{},"For a deeper look at how regex patterns behave on real text, the ",[1188,2333,2335],{"href":2334},"\u002Fblog\u002Fregex-find-replace-guide","Regex Find & Replace Guide"," covers capture groups, quantifiers, and lookaheads — everything you need to build robust text-processing patterns beyond word counting.",[2338,2339,2340],"style",{},"html pre.shiki code .spNyl, html code.shiki .spNyl{--shiki-light:#9C3EDA;--shiki-default:#C792EA;--shiki-dark:#C792EA}html pre.shiki code .s2Zo4, html code.shiki .s2Zo4{--shiki-light:#6182B8;--shiki-default:#82AAFF;--shiki-dark:#82AAFF}html pre.shiki code .sMK4o, html code.shiki .sMK4o{--shiki-light:#39ADB5;--shiki-default:#89DDFF;--shiki-dark:#89DDFF}html pre.shiki code .sHdIc, html code.shiki .sHdIc{--shiki-light:#90A4AE;--shiki-light-font-style:italic;--shiki-default:#EEFFFF;--shiki-default-font-style:italic;--shiki-dark:#BABED8;--shiki-dark-font-style:italic}html pre.shiki code .s7zQu, html code.shiki .s7zQu{--shiki-light:#39ADB5;--shiki-light-font-style:italic;--shiki-default:#89DDFF;--shiki-default-font-style:italic;--shiki-dark:#89DDFF;--shiki-dark-font-style:italic}html pre.shiki code .sTEyZ, html code.shiki 
.sTEyZ{--shiki-light:#90A4AE;--shiki-default:#EEFFFF;--shiki-dark:#BABED8}html pre.shiki code .swJcz, html code.shiki .swJcz{--shiki-light:#E53935;--shiki-default:#F07178;--shiki-dark:#F07178}html .light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html.light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .sfazB, html code.shiki .sfazB{--shiki-light:#91B859;--shiki-default:#C3E88D;--shiki-dark:#C3E88D}html pre.shiki code .sHwdD, html code.shiki .sHwdD{--shiki-light:#90A4AE;--shiki-light-font-style:italic;--shiki-default:#546E7A;--shiki-default-font-style:italic;--shiki-dark:#676E95;--shiki-dark-font-style:italic}html pre.shiki code .sbssI, html code.shiki 
.sbssI{--shiki-light:#F76D47;--shiki-default:#F78C6C;--shiki-dark:#F78C6C}",{"title":125,"searchDepth":157,"depth":157,"links":2342},[2343,2344,2356,2357,2358,2359,2360,2361,2362],{"id":32,"depth":157,"text":33},{"id":109,"depth":157,"text":110,"children":2345},[2346,2348,2350,2352,2354],{"id":114,"depth":192,"text":2347},"Method 1: text.split(' ').length — The Naive Split",{"id":274,"depth":192,"text":2349},"Method 2: text.trim().split(\u002F\\s+\u002F).filter(Boolean).length — The Patched Split",{"id":471,"depth":192,"text":2351},"Method 3: (text.match(\u002F\\b\\w+\\b\u002Fg) || []).length — The Classic Regex",{"id":661,"depth":192,"text":2353},"Method 4: (text.match(\u002F\\p{L}+\u002Fgu) || []).length — Unicode Property Escapes",{"id":855,"depth":192,"text":2355},"Method 5: Intl.Segmenter — The Right Answer",{"id":1204,"depth":157,"text":1205},{"id":1354,"depth":157,"text":1355},{"id":1569,"depth":157,"text":1570},{"id":1636,"depth":157,"text":1637},{"id":1843,"depth":157,"text":1844},{"id":1888,"depth":157,"text":1889},{"id":2188,"depth":157,"text":2189,"children":2363},[2364,2365,2367,2369,2370,2371,2372,2373],{"id":2192,"depth":192,"text":2193},{"id":2210,"depth":192,"text":2366},"Why does text.split(' ').length return the wrong count?",{"id":2237,"depth":192,"text":2368},"Does \u002F\\b\\w+\\b\u002Fg work for non-English text?",{"id":2257,"depth":192,"text":2258},{"id":2266,"depth":192,"text":2267},{"id":2276,"depth":192,"text":2277},{"id":2300,"depth":192,"text":2301},{"id":2312,"depth":192,"text":2313},"Dev Tools","From naive split() to Intl.Segmenter — 5 JavaScript word count methods benchmarked for accuracy, Unicode support, and edge cases. Know which one to ship.","md",[2378,2380,2382,2384,2386,2388,2390],{"question":2193,"answer":2379},"Intl.Segmenter with granularity: 'word' is the most accurate method. 
It's part of the ECMAScript Internationalization API (ECMA-402), built into every major JavaScript engine, and handles CJK and Thai (which have no whitespace between words), Arabic, as well as emoji clusters, hyphenated words, and contractions correctly. For most English-only use cases, \u002F\\p{L}+\u002Fgu with the u flag is a solid and simpler alternative.",{"question":2366,"answer":2381},"Three reasons. First, it counts empty strings when your text has consecutive spaces — 'hello  world'.split(' ') returns ['hello', '', 'world'], length 3, not 2. Second, it misses tabs (\\t), newlines (\\n), and non-breaking spaces (U+00A0) as word separators. Third, it includes punctuation attached to words as part of the word token, which distorts unique-word metrics.",{"question":2368,"answer":2383},"No. In JavaScript, \\w is shorthand for [A-Za-z0-9_] — it matches only ASCII letters, digits, and underscore. Every Cyrillic, Arabic, Greek, Hebrew, Korean, Chinese, and Japanese character returns zero matches. If you're building anything that handles non-English text, this regex silently produces wrong counts. Use \u002F\\p{L}+\u002Fgu with the u flag instead.",{"question":2258,"answer":2385},"Intl.Segmenter is part of the ECMAScript Internationalization API (ECMA-402), built into V8 (Chrome\u002FNode.js), SpiderMonkey (Firefox), and JavaScriptCore (Safari). It became Baseline in 2024, when Firefox 125 added support, meaning all three major browser engines now support it. For Node.js, it's available from v16.0.0 onwards. You can use it without a polyfill in any modern environment.",{"question":2267,"answer":2387},"For texts under ~50,000 words, any of the methods runs fast enough on the main thread. For larger inputs — manuscripts, pasted books, bulk processing — offload to a Web Worker so you don't freeze the UI.
The Intl.Segmenter API is not transferable to workers directly, so pass the raw string via postMessage and run the segmenter in the worker context.",{"question":2277,"answer":2389},"In English prose, contractions (don't, it's, you're) should count as one word — that matches how editors, teachers, and publishers count. Whitespace splitting (Methods 1 and 2) keeps don't as one token because the apostrophe isn't whitespace. The \\w and \\p{L} regexes (Methods 3 and 4) split it into don and t, since the apostrophe is neither a word character nor a letter. Intl.Segmenter with granularity: 'word' correctly returns don't as a single word segment.",{"question":2313,"answer":2391},"Filter the results of your segmenter or regex to only match segments that contain at least one Unicode letter. With Intl.Segmenter, check segment.isWordLike === true — the API marks punctuation and spaces as non-word segments automatically, but numeric segments such as 123 are also word-like, so add a letter test (for example \u002F\\p{L}\u002Fu) if numbers must be excluded. With \u002F\\p{L}+\u002Fgu, only letter sequences match by definition, so numbers and standalone punctuation are excluded.","\u002Farticles\u002Fhow-to-count-words-javascript\u002Fhero.webp",{},"\u002Fen\u002Fhow-to-count-words-javascript","2026-05-12",{"title":5,"description":2375},"en\u002Fhow-to-count-words-javascript",[2399,2400,2401,2402,2403],"javascript","word count","intl segmenter","unicode","text processing","-NycTQSEgM8fmIkiYxmhrMlYQ0OgmuLRQsfWt-6sOuo",1782712871565]