@schmamps

In parser.py:

    # Jagadeesh, J., Pingali, P., & Varma, V. (2005). Sentence Extraction Based Single Document Summarization. International Institute of Information Technology, Hyderabad, India, 5.
    def getSentencePositionScore(self, i, sentenceCount):
        normalized = i / (sentenceCount * 1.0)

        if normalized > 0 and normalized <= 0.1:
            return 0.17
        # ... snip ....
        elif normalized > 0.8 and normalized <= 0.9:
            return 0.04
        elif normalized > 0.9 and normalized <= 1.0:
            return 0.15
        else:
            return 0

When Summarizer scores sentences (in Summarizer.computeScore()), it iterates through the list and passes each sentence's index as the first argument. Notice what happens when you unroll that loop, though:

    # where: ['sentence 1', 'sentence 2', ..., 'sentence 10']
    sentences = ['sentence ' + str(x + 1) for x in range(10)]

    # first sentence: normalized = 0, falls to the else branch, returns 0
    sentencePosition = parser.getSentencePositionScore(0, len(sentences))
    # ...
    # last sentence: normalized = 0.9, returns 0.04
    sentencePosition = parser.getSentencePositionScore(9, len(sentences))

Instinct tells me that a (next-to-)zero score for the introductory and concluding sentences is wrong. Adding to my suspicion, the cases in the referenced paper collectively cover the interval 0 < normalized <= 1.0, so a valid position should never reach the default branch at all.

To me this looks like an off-by-one bug. Nine of the ten sentences above are caught one bucket too early, and the first falls through to the default. That shouldn't even happen with valid arguments!

While I'm opinionated about this being a bug, respectable people can differ about where to fix it. One could make a case for Summarizer.computeScore(), but Parser.getSentencePositionScore() is already parameterized by index (zero-based, presumably), so fixing it there has the smallest impact: normalize the one-based position (i + 1) instead of i.
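
For what it's worth, here is a standalone sketch of that fix (name and style adapted, not the project's actual method; the intermediate buckets from the paper are elided just as in the snippet above):

    # Hypothetical standalone version of the position scorer with the
    # proposed fix: normalize the one-based position (i + 1) so that
    # index 0 lands in (0, 0.1] and the last index lands exactly at 1.0.
    def get_sentence_position_score(i, sentence_count):
        normalized = (i + 1) / float(sentence_count)

        if 0 < normalized <= 0.1:
            return 0.17
        # ... intermediate buckets from the paper, unchanged ...
        elif 0.8 < normalized <= 0.9:
            return 0.04
        elif 0.9 < normalized <= 1.0:
            return 0.15
        else:
            return 0

With ten sentences, the first now scores 0.17 and the last 0.15, and the default branch is only reachable with an out-of-range index.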
