<p>One of the most interesting lessons from the pandemic is the harm that can be caused by <a href="https://en.wikipedia.org/wiki/Data_dredging">p-hacking</a>. A paper with errors related to p-hacking that hasn’t been peer-reviewed is promoted by one or more people with millions of followers on social media and then some of those followers suffer horrible outcomes because they had a false sense of security. Maybe the authors of the paper did not even realize the problem but for whatever reason, the social media rock stars felt the need to spread the misinformation. And another very interesting lesson is that the social media rock stars seem to almost never issue a correction after the paper is reviewed and rejected.</p>
<p>To illustrate p-hacking with a non-serious example, I am using real public data from my experience attending drop-in hockey.</p>
<p>I wanted to know if goalies tended to show up more or less frequently on any particular day of the week because it is more fun to play when at least one goalie shows up. I collected 85 independent samples.</p>
<p>For all days, there were 27 days with 1 goalie and 27 days with 2 goalies and 31 days with 0 goalies.</p>
<p>Our test procedure will define the test statistic X = the number of days that at least one goalie registered.</p>
<p>I am not smart so instead of committing to a hypothesis to test prior to looking at the data, I cheat and look at the data first and notice that the numbers for Tuesday look especially low. So, I focus on goalie registrations on Tuesdays. Using the data above for all days (54 of 85 days had at least one goalie, and 54/85 ≈ 0.635), the null hypothesis <script type="math/tex">H_0</script> is that the probability that at least one goalie registered on a Tuesday is 0.635.</p>
<p>For perspective, taking 19 samples for Tuesday would give an expected value of about 12 samples (19 × 0.635 ≈ 12.07) where at least 1 goalie registered.</p>
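<p>As a quick check of the arithmetic, here is a minimal sketch that derives the null-hypothesis probability and the expected Tuesday count from the day counts given above. The variable names are my own and only for illustration.</p>

```ruby
# Number of days observed, keyed by how many goalies registered (from the post).
days_by_goalies = { 0 => 31, 1 => 27, 2 => 27 }

total_days = days_by_goalies.values.sum             # 85 samples in all
days_with_goalie = total_days - days_by_goalies[0]  # 54 days with at least one goalie

# Probability under H0 that at least one goalie registers on a given day.
p0 = days_with_goalie.to_f / total_days

# Expected number of Tuesdays, out of 19, with at least one goalie under H0.
expected_tuesdays = 19 * p0

puts p0.round(3)                 # 0.635
puts expected_tuesdays.round(2)  # 12.07
```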
<p>Suppose we wanted to propose an alternative hypothesis that p &lt; 0.635 for Tuesday. What is the rejection region of values that would refute the null hypothesis (p=0.635)?</p>
<p>Let’s aim for α = 0.05 as the level of significance. This means that (pretending that I had not egregiously cherry-picked data beforehand) we want there to be less than a 5% chance that the experimental result would fall inside the rejection region if the null hypothesis were true (a Type I error).</p>
<p>For a binomial random variable X, the pmf b(x; n, p) is</p>
<script type="math/tex; mode=display">{n \choose x} p^x (1-p)^{n-x}</script>
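<div class="highlighter-rouge"><pre class="highlight"><code># n! as an exact integer; inject returns nil for an empty range, so 0! is 1
def factorial(n)
  (1..n).inject(:*) || 1
end

# Binomial coefficient "n choose k"
def combination(n, k)
  factorial(n) / (factorial(k) * factorial(n - k))
end

# Binomial pmf b(x; n, p)
def pmf(x, n, p)
  combination(n, x) * (p ** x) * ((1 - p) ** (n - x))
end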
</code></pre>
</div>
<p>The cdf B(x; n, p) = P(X <script type="math/tex">\le</script> x) is</p>
<script type="math/tex; mode=display">\sum_{i=0}^x b(i; n, p)</script>
<div class="highlighter-rouge"><pre class="highlight"><code># Binomial cdf B(x; n, p): the sum of the pmf from 0 through x
def cdf(x, n, p)
  (0..x).map { |i| pmf(i, n, p) }.sum
end
</code></pre>
</div>
<p>For n=19 samples, if x <script type="math/tex">\le</script> 9 was chosen as the rejection region, then α = P(X <script type="math/tex">\le</script> 9 when X ~ Bin(19, 0.635)) = 0.112</p>
<div class="highlighter-rouge"><pre class="highlight"><code>2.4.10 :001 > load 'stats.rb'
=> true
2.4.10 :002 > cdf(9,19,0.635)
=> 0.1121416295262306
</code></pre>
</div>
<p>This choice is not good enough because even if the null hypothesis is true, there is a large 11% chance (again, pretending I had not cherry-picked the data) that the test statistic falls in the rejection region.</p>
<p>So, if we narrow the rejection region to x <script type="math/tex">\le</script> 8, then α = P(X <script type="math/tex">\le</script> 8 when X ~ Bin(19, 0.635)) = 0.047</p>
<div class="highlighter-rouge"><pre class="highlight"><code>2.4.10 :003 > cdf(8,19,0.635)
=> 0.04705965393607316
</code></pre>
</div>
<p>This rejection region satisfies the requirement of a 0.05 significance level.</p>
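<p>The trial-and-error search above can also be automated. The following is only a sketch: <code>lower_rejection_cutoff</code> is a hypothetical helper name of my own, and it repeats the pmf and cdf definitions so that it runs on its own.</p>

```ruby
def factorial(n)
  (1..n).inject(:*) || 1
end

def combination(n, k)
  factorial(n) / (factorial(k) * factorial(n - k))
end

def pmf(x, n, p)
  combination(n, x) * (p ** x) * ((1 - p) ** (n - x))
end

def cdf(x, n, p)
  (0..x).map { |i| pmf(i, n, p) }.sum
end

# Hypothetical helper: largest cutoff c for which P(X <= c) is at most alpha
# when X ~ Bin(n, p). Returns nil if even c = 0 is too likely under H0.
def lower_rejection_cutoff(n, p, alpha)
  (0..n).take_while { |c| cdf(c, n, p) <= alpha }.last
end

cutoff = lower_rejection_cutoff(19, 0.635, 0.05)
puts cutoff  # 8, the same region found by hand
```

<p>Because the cdf only grows as the cutoff grows, stopping at the first cutoff that exceeds α is enough.</p>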
<p>The n=19 samples for Tuesday are [0, 0, 0, 1, 0, 0, 0, 0, 1, 2, 0, 1, 1, 0, 1, 2, 2, 0, 0]. Eight of these are nonzero, so at least one goalie registered on 8 of the 19 Tuesdays (x=8).</p>
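<p>The test statistic can be read straight off that list. A small sketch, with variable names of my own choosing:</p>

```ruby
# Goalies registered on each of the 19 Tuesdays (from the post).
tuesday_goalies = [0, 0, 0, 1, 0, 0, 0, 0, 1, 2, 0, 1, 1, 0, 1, 2, 2, 0, 0]

n = tuesday_goalies.size                  # number of Tuesday samples
x = tuesday_goalies.count { |g| g >= 1 }  # Tuesdays with at least one goalie

puts "n=#{n}, x=#{x}"  # n=19, x=8
```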
<p>Since x=8 falls within the rejection region, the null hypothesis is (supposedly) rejected for Tuesday samples. So I announce to my hockey friends on social media “Beware! Compared to all days of the week, it is less likely that at least one goalie will register on a Tuesday!”</p>
<p><img src="https://coffeebucks.s3.amazonaws.com/nope_not_how_it_works.jpg" alt="Nope, not how it works" /></p>
<p>Before addressing the p-hacking, let’s first address another issue. The experimental result was x = 8, which gave us a 0.047 probability of obtaining 8 or fewer days in a sample of 19, assuming that the null hypothesis (p=0.635) is true. This result makes the 0.05 cutoff by the skin of its teeth. So, just saying that the null hypothesis was refuted with α = 0.05 does not reveal that it was barely refuted. It is much more informative to report that the <a href="https://en.wikipedia.org/wiki/P-value">p-value</a> was 0.047, which also avoids imposing a particular α on readers who want to draw their own conclusions.</p>
<p>Now let’s discuss the p-hacking problem. I gave myself the impression that there was only a 5% chance that I would see a significant result even if the null hypothesis (p=0.635) were true. However, since there is data for 5 days (Monday, Tuesday, Wednesday, Thursday, Friday), I could have performed 5 different tests. If I chose that same p &lt; 0.635 alternative hypothesis for each, then there would similarly be a 5% chance of a significant result for each test. Assuming the tests are independent, the probability that all 5 tests would not be significant would be 0.95 * 0.95 * 0.95 * 0.95 * 0.95 = 0.77. Therefore, the probability that at least one test would be significant is 1 - 0.77 = 0.23 (the <a href="https://en.wikipedia.org/wiki/Family-wise_error_rate">Family-wise error rate</a>) which is much higher than 0.05. That’s like flipping a coin twice and getting two heads which is not surprising at all. We should expect such a result even if the null hypothesis is true. Therefore, there is not a significant result for Tuesday.</p>
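<p>The same family-wise calculation can be sketched for any number of tests. This assumes the tests are independent and each is run at exactly α = 0.05, which is a simplification. <code>family_wise_error_rate</code> is a name I made up for illustration.</p>

```ruby
# Probability of at least one false positive across m independent tests,
# each run at significance level alpha.
def family_wise_error_rate(alpha, m)
  1 - (1 - alpha) ** m
end

# Show how the rate grows as more days of the week are tested.
(1..5).each do |m|
  puts "#{m} test(s): #{family_wise_error_rate(0.05, m).round(3)}"
end
# The final line, for 5 tests, shows 0.226.
```

<p>The unrounded value for 5 tests is about 0.226, which becomes the 0.23 quoted above when 0.95<sup>5</sup> is first rounded to 0.77.</p>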
<p>I was inspired to write this blog post after watching <a href="https://twitter.com/DrSusanOliver1">Dr. Susan Oliver</a>’s <a href="https://www.youtube.com/watch?v=drSAsfuMkuw">Antivaxxers fooled by P-Hacking and apples to oranges comparisons</a>. The video references the paper <a href="https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002106">The Extent and Consequences of P-Hacking in Science</a> (2015).</p>
<p>A few weeks ago, I ordered a <a href="https://www.staples.com/3m-aura-n95-particulate-respirator-white-20-pack-9205p-20-dc/product_24503827">box of N95 masks</a> as I had been following the rising positivity rate. Both <a href="https://www.washingtonpost.com/health/2022/05/25/long-covid-vaccines-slight-protection/">Vaccines may not prevent many symptoms of long covid, study suggests</a> and <a href="https://www.washingtonpost.com/health/2022/05/25/long-covid-vaccines-slight-protection/">1 Of 5 With Covid May Develop Long Covid, CDC Finds</a> were also persuasive.</p>
<p>A long time ago, I read <a href="https://openlibrary.org/isbn/0060976519">The Language Instinct</a>. Inside the back page of my book are notes with page numbers. This is a practice I learned from a <a href="https://openlibrary.org/books/OL7563133M/World_Is_My_Home">book</a> by <a href="https://en.wikipedia.org/wiki/James_A._Michener">James Michener</a>. At some point, I started sharing in conversations something I learned. Unfortunately, I had not made a note for this that I could check and the information was more complex than I remembered. Since I had shared this more than once, I thought I should really find the reference and it was not easy but I found it on page 293. The first part I had right.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>In sum, acquisition of a normal language is guaranteed for children up to the age of six, is steadily compromised from then until shortly after puberty, and is rare thereafter.
</code></pre>
</div>
<p>Here is the part I screwed up.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>We do know that the language-learning circuitry of the brain is more plastic in childhood; children learn or recover language when the left hemisphere of the brain is damaged or even surgically removed (though not quite at normal levels), but comparable damage in an adult usually leads to permanent aphasia.
</code></pre>
</div>
<p>While this itself is fascinating to me, I had been embellishing the story to say language is acquired in the brain’s right hemisphere in children and the left for adults. Now that I’m rereading it after so many years, it is clear that the book says this can happen but is not necessarily so.</p>
<p>My blog is now linking to <a href="https://openlibrary.org">openlibrary.org</a> for <a href="https://indieweb.org/read">read posts</a>. If you have the book’s ISBN, then it is trivial to link to openlibrary’s page for your book. It would be cool if those pages accepted webmention so that you could see who is reading the book.</p>
<p><img src="https://coffeebucks.s3.amazonaws.com/c_wright_mills.png" alt="Artillery Battery A" /></p>
<p>On Monday, there were a few people in my Twitter feed sharing <a href="https://www.thebatt.com/news/the-rudder-association-a-deep-dive-into-the-conservative-former-student-group-with-plans-to/article_ee9f31ec-9dae-11ec-a4cc-efe4856b436c.html">Texas A&M’s Battalion article about The Rudder Association</a>. While Texas A&M has improved so much over the years, this stealthy group called the Rudder Association is now embarrassing the school. I was glad to read the article and reassured that the kids are alright. I couldn’t help but be reminded of the letters written to the Battalion in 1935 by a freshman named <a href="https://en.wikipedia.org/wiki/C._Wright_Mills">C. Wright Mills</a>.</p>
<blockquote>
<p>College students are supposed to become leaders of thought and action in later life. It is expected they will profit from a college education by developing an open and alert mind to be able to cope boldly with everyday problems in economics and politics. They cannot do this unless they learn to think independently for themselves and to stand fast for their convictions. Is the student at A and M encouraged to do this? Is he permitted to do it? The answer is sadly in the negative.</p>
</blockquote>
<p>Little did he know that current students would be dealing with this shit 85 years later with a group of former students with nothing better to do than infiltrate student-run organizations from freshman orientation to the newspaper. But shocking no one, they were too incompetent to maintain the privacy of the school regents who met with them.</p>
<blockquote>
<p>According to meeting minutes from Dec. 1, 2020, the Rudder Association secured the attendance of four members of the A&M System Board of Regents. The meeting minutes obtained by The Battalion were censored by TRA to remove the names of the regents in the meeting as well as other “highly sensitive information.”</p>
</blockquote>
<blockquote>
<p>“DO NOT USE THEIR NAMES BEYOND THE RUDDER BOARD. They do not wish to be outed,” the minutes read on the regents in attendance.</p>
</blockquote>
<blockquote>
<p>Further examination by The Battalion revealed, however, that the censored text could be copied and pasted into a text document to be viewed in its entirety due to TRA using a digital black highlighter to censor.</p>
</blockquote>
<p>Well done, Battalion.</p>
<p>(photo is from <a href="https://openlibrary.org/works/OL14863138W/C._Wright_Mills">C. Wright Mills: Letters and autobiographical writings</a>)</p>
<p>I read <a href="https://jackjamieson.net/259929-2/">Bridging the Open Web and APIs: Alternative Social Media Alongside the Corporate Web</a> because it was a good opportunity to fill some holes in my knowledge about the <a href="https://indieweb.org">Indieweb</a> and Facebook.</p>
<p><a href="https://brid.gy/">Brid.gy</a> enables people to syndicate their posts from their own site to large proprietary social media sites.</p>
<p>Although I don’t use it myself, I’m often impressed when I see all the Twitter “likes” and responses that are <a href="https://indieweb.org/backfeed">backfed</a> by brid.gy to the canonical post on a personal website.</p>
<p>The paper details the challenging history of providing the same for Facebook (in which even Cambridge Analytica plays a part) and helped me appreciate why I never see similar responses from Facebook on personal websites these days.</p>
<p>It ends on a positive note…</p>
<blockquote>
<p>while Facebook’s API shutdown led to an overnight decrease in Bridgy accounts (Barrett, 2020), other platforms with which Bridgy supports POSSE remain functional and new platforms have been added, including Meetup, Reddit, and Mastodon.</p>
</blockquote>
<p>Although it makes a good point, the "False balance" article seems to accept the widely held assumption that Rogan is just "letting people voice their views" without interrupting them. But he did interrupt recently, with guest Josh Szeps, to wrongly argue against covid myocarditis evidence.</p>
