Updated print
Pringled committed Oct 11, 2024
1 parent 8e7d66f commit e28db3e
Showing 1 changed file with 17 additions and 9 deletions.
26 changes: 17 additions & 9 deletions tutorials/semantic_deduplication.ipynb
@@ -61,7 +61,7 @@
},
{
"cell_type": "code",
-"execution_count": 83,
+"execution_count": 96,
"metadata": {},
"outputs": [
{
@@ -458,54 +458,62 @@
},
{
"cell_type": "code",
-"execution_count": 91,
+"execution_count": 99,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Example 1:\n",
"Train text:\n",
"Jackson Squares Off With Attorney SANTA MARIA, Calif. - Fans of Michael Jackson erupted in cheers Monday as the pop star emerged from a double-decker tour bus and went into court for a showdown with the prosecutor who has pursued him for years on child molestation charges...\n",
"Test text:\n",
"Jackson Squares Off With Prosecutor SANTA MARIA, Calif. - Fans of Michael Jackson erupted in cheers Monday as the pop star emerged from a double-decker tour bus and went into court for a showdown with the prosecutor who has pursued him for years on child molestation charges...\n",
"Differences:\n",
"- Attorney + Prosecutor\n",
"--------------------------------------------------\n",
"Example 2:\n",
"Train text:\n",
"Cassini Spies Two Moons Around Saturn (AP) AP - NASA's Cassini spacecraft has spied two new little moons around satellite-rich Saturn, the space agency said.\n",
"Test text:\n",
"Cassini Spies Two Little Saturn Moons (AP) AP - NASA's Cassini spacecraft has spied two new little moons around satellite-rich Saturn, the space agency said Monday.\n",
"Differences:\n",
"+ Little + Saturn - Around - Saturn - said. + said + Monday.\n",
"--------------------------------------------------\n",
"Example 3:\n",
"Train text:\n",
"Intel to Delay Product for High-Definition TVs SAN FRANCISCO (Reuters) - In the latest of a series of product delays, Intel Corp. has postponed the launch of a video display chip it had previously planned to introduce by year end, putting off a showdown with Texas Instruments Inc. in the fast-growing market for high-definition television displays.\n",
"Test text:\n",
"Intel to delay product aimed for high-definition TVs SAN FRANCISCO -- In the latest of a series of product delays, Intel Corp. has postponed the launch of a video display chip it had previously planned to introduce by year end, putting off a showdown with Texas Instruments Inc. in the fast-growing market for high-definition television displays.\n",
"Differences:\n",
"- Delay + delay - Product + product + aimed - High-Definition + high-definition + -- - (Reuters) - -\n",
"--------------------------------------------------\n",
"Example 4:\n",
"Train text:\n",
"Staples Profit Up Sharply, to Enter China NEW YORK (Reuters) - Staples Inc. <A HREF=\"http://www.investor.reuters.com/FullQuote.aspx?ticker=SPLS.O target=/stocks/quickinfo/fullquote\">SPLS.O</A>, the top U.S. office products retailer, on Tuesday reported a 39 percent jump in quarterly profit, raised its full-year forecast and said it plans to enter the fast-growing Chinese market.\n",
"Test text:\n",
"Staples Profit Up, to Enter China Market NEW YORK (Reuters) - Staples Inc. <A HREF=\"http://www.investor.reuters.com/FullQuote.aspx?ticker=SPLS.O target=/stocks/quickinfo/fullquote\">SPLS.O</A>, the top U.S. office products retailer, on Tuesday reported a 39 percent jump in quarterly profit, raised its full-year forecast and said it plans to enter the fast-growing Chinese market, sending its shares higher.\n",
"Differences:\n",
"- Up + Up, - Sharply, + Market - market. + market, + sending + its + shares + higher.\n",
"--------------------------------------------------\n",
"Example 5:\n",
"Train text:\n",
"Stocks Climb on Drop in Consumer Prices NEW YORK - Stocks rose for a second straight session Tuesday as a drop in consumer prices Tuesday allowed investors to put aside worries about inflation, at least for the short term. With gasoline prices falling to eight-month lows, the Consumer Price Index registered a small drop in July, giving consumers a respite from soaring energy prices...\n",
"Test text:\n",
"Stocks Climb on Drop in Consumer Prices NEW YORK - Stocks rose for a second straight session Tuesday as a drop in consumer prices allowed investors to put aside worries about inflation, at least for the short term. With gasoline prices falling to eight-month lows, the Consumer Price Index registered a small drop in July, giving consumers a respite from soaring energy prices...\n",
"Differences:\n",
"- Tuesday\n",
"--------------------------------------------------\n"
]
}
],
"source": [
"# Show a few duplicates with their originals, highlighting word-level differences\n",
"num_examples = 5\n",
"for i, test_idx in enumerate(duplicate_indices_in_test[:num_examples]):\n",
" train_idx = duplicate_to_original_mapping[test_idx]\n",
" print(f\"Example {i + 1}:\")\n",
"\n",
" print(f\"Train text:\\n{texts_train[train_idx]}\")\n",
" print(f\"Test text:\\n{texts_test[test_idx]}\")\n",
-" print(\"-\" * 50)"
+" print(\"Differences:\")\n",
+" print(display_word_differences(texts_train[train_idx], texts_test[test_idx]))\n",
+" print(\"-\" * 50)\n"
]
},
{
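The updated cell calls `display_word_differences`, whose definition is not part of this diff. A minimal sketch of what such a helper could look like, assuming it splits both texts on whitespace and compares the word lists with the standard library's `difflib.ndiff` (this is an assumption, not the notebook's confirmed implementation, although it is consistent with the `- Attorney + Prosecutor` style of output shown in the cell):

```python
import difflib


def display_word_differences(x: str, y: str) -> str:
    """Return the word-level differences between two texts as one string.

    Hypothetical reimplementation: the real helper is defined elsewhere
    in the notebook and may differ from this sketch.
    """
    # ndiff yields one entry per word, prefixed with "- " (only in x),
    # "+ " (only in y), "  " (in both) or "? " (intraline hint).
    diff = difflib.ndiff(x.split(), y.split())
    # Keep only removed and added words; drop shared words and hint lines.
    return " ".join(word for word in diff if word.startswith(("+ ", "- ")))


train = "Jackson Squares Off With Attorney SANTA MARIA, Calif."
test = "Jackson Squares Off With Prosecutor SANTA MARIA, Calif."
print(display_word_differences(train, test))  # prints: - Attorney + Prosecutor
```

Comparing whitespace-separated words rather than characters keeps the output short for near-duplicates like the ones above, where only a handful of tokens change. Identical texts yield an empty string.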
