Tag: statistics
-
Deep networks learn to parse uniform-depth context-free languages from local statistics
arXiv:2602.06065v1. Abstract: Understanding how the structure of language can be learned from sentences alone is a central question in both cognitive science and machine learning. Studies of the internal representations of Large Language Models (LLMs) support their ability to parse text…
-
“Rebuilding” Statistics in the Age of AI: A Town Hall Discussion on Culture, Infrastructure, and Training
arXiv:2601.17510v1. Abstract: This article presents the full, original record of the 2024 Joint Statistical Meetings (JSM) town hall, “Statistics in the Age of AI,” which convened leading statisticians to discuss how the field is evolving in…
-
Inferential Statistics on long-form census data from stats can
I am using the following tool https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=9810065601 to query Statistics Canada and get data from the long-form census. However, since it’s a census of 25% of the population, there is a need for inferential statistics. That being said, in order to do inferential statistics on the…
-
Asynchronous Gossip Algorithms for Rank-Based Statistical Methods
arXiv:2509.07543v1. Abstract: As decentralized AI and edge intelligence become increasingly prevalent, ensuring robustness and trustworthiness in such distributed settings has become a critical issue, especially in the presence of corrupted or adversarial data. Traditional decentralized algorithms are vulnerable to data contamination as they typically rely on…
-
On computing and the complexity of computing higher-order $U$-statistics, exactly
arXiv:2508.12627v1. Abstract: Higher-order $U$-statistics abound in fields such as statistics, machine learning, and computer science, but are known to be highly time-consuming to compute in practice. Despite their widespread appearance, a comprehensive study of their computational complexity is surprisingly lacking. This paper…
-
How’s the job market for Bayesian statistics?
I’m a data scientist with 1 YOE. I’ve mostly worked on credit scoring models, SQL, and Power BI. Lately, I’ve been thinking of going deeper into Bayesian statistics, and I’m currently going through the Statistical Rethinking book. But I’m wondering: is it worth focusing heavily on Bayesian stats? Or…
-
The Dangers of Deceptive Data Part 2–Base Proportions and Bad Statistics
This is a follow-up to my earlier article: The Dangers of Deceptive Data–Confusing Charts and Misleading Headlines. My first article focused on how visualizations can be used to mislead, diving into a form of data presentation widely used in public matters. In this article,…
-
Generate-then-Verify: Reconstructing Data from Limited Published Statistics
arXiv:2504.21199v1. Abstract: We study the problem of reconstructing tabular data from aggregate statistics, in which the attacker aims to identify interesting claims about the sensitive data that can be verified with 100% certainty given the aggregates. Successful attempts in prior work have conducted studies in…