Skip to content

Conversation

@BurdetteLamar
Copy link
Member

Adds a command-line interface for CSV::filter; this supports using CSV without having to write Ruby code.

In its present form, this is a true filter: reads from stdin, writes to stdout. More could be done: input could be from ARGF or an IO stream; output could be to IO stream.

@BurdetteLamar
Copy link
Member Author

I'm not sure what to do about these failures. I chose to require 'minitest/autorun' because it has the assertion method capture_subprocess_io, which I've used to evaluate the output.

Suggestions welcome.

Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interested suggestion. But filter is too generic for command line name to handle only CSV.

How about adding csv- prefix or something?

# -*- coding: utf-8 -*-
# frozen_string_literal: false

require 'minitest/autorun'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're using test-unit not minitest.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adapted.

@BurdetteLamar
Copy link
Member Author

This is an interested suggestion. But filter is too generic for command line name to handle only CSV.

How about adding csv- prefix or something?

Changed to csv-filter.

@BurdetteLamar
Copy link
Member Author

Should have added earlier: I've needed substantial debugging code to get this work done, and have left it in (just in case).

@BurdetteLamar BurdetteLamar requested a review from kou November 6, 2024 14:02
Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review all changes. Because there are many parts I need to add comments... Can we start from a smaller script instead of this?

bin/csv-filter Outdated

options = {}

OptionParser.new do |p|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use the following style instead of OptionParser.new do end.parse!? The .parse! part in OptionParser.new do end.parse! is easy to miss.

parser = OptionParser.new
parser.program_name = ...
parser.on(...)
parser.parse!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

bin/csv-filter Outdated

OptionParser.new do |p|

p.program_name = File.basename $0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can omit this. We can use the default value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

bin/csv-filter Outdated

p.program_name = File.basename $0
p.version = CSV::VERSION
p.release = nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can omit this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

bin/csv-filter Outdated
p.program_name = File.basename $0
p.version = CSV::VERSION
p.release = nil
p.summary_indent = ' ' * 4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can omit this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

bin/csv-filter Outdated
p.version = CSV::VERSION
p.release = nil
p.summary_indent = ' ' * 4
p.banner = <<-EOF
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use BANNER not EOF for here document mark?
EOF (End Of File?) is difficult to understand.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

bin/csv-filter Outdated

# Returns an array of converter names.
# \String +s+ contains a comma-separated list of converter names;
# each optionally begins with a colon.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this feature?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

bin/csv-filter Outdated
Comment on lines 9 to 17
def parse_converters(s)
converters = []
s.split(',').each do |name|
name.sub(/^:/, '')
sym = name.to_sym
converters.push(sym)
end
converters
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def parse_converters(s)
converters = []
s.split(',').each do |name|
name.sub(/^:/, '')
sym = name.to_sym
converters.push(sym)
end
converters
end
def parse_converters(raw_converters)
raw_converters.split(",").collect(&:to_sym)
end

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced and renamed.

bin/csv-filter Outdated
Comment on lines 95 to 97
p.on('--field_size_limit N',
'Maximum field size (characters)') do |value|
options[:field_size_limit] = value.to_i
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
p.on('--field_size_limit N',
'Maximum field size (characters)') do |value|
options[:field_size_limit] = value.to_i
p.on('--field_size_limit N', Integer,
'Maximum field size (characters)') do |value|
options[:field_size_limit] = value

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

bin/csv-filter Outdated
'Treat the first input row as headers;',
'STR is true or comma-separated headers.'
) do |value|
options[:headers] = value == 'true' ? true : parse_converters(value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's strange that we use parse_converters for headers.
We need better name than parse_converters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed.

bin/csv-filter Outdated
end
p.on('--liberal_parsing',
'Attempt to parse non-conformant input.') do
options[:liberal_parsing] = :no_argument
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:no_argument is strange. :liberal_parsing must be true or false: p.on('--[no-]liberal-parsing', ...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Symbol :no_argument is only internal, and tells the testing how to behave. The CSV user does not see it.

@BurdetteLamar
Copy link
Member Author

Thanks, @kou, for your interest, and for this helpful review. I'm at RubyConf in Chicago until the weekend, but will work on this when I get home.

When we're done with this review cycle, I'll be interested in adding to this work. The filter will not be very useful without a config file (maybe similar to ~/.gitconfig?) that will avoid the need for including lots of options on the command line.

@BurdetteLamar
Copy link
Member Author

BurdetteLamar commented Nov 17, 2024

I have commented out 13 of the options (in bin and test), leaving 16 in. Each of those left in is either because it's mentioned in the help or because it has comments here.

Maybe we can work with these, and do the rest piecemeal?

@kou
Copy link
Member

kou commented Nov 17, 2024

Could you remove them from this PR instead of commenting them out?

FYI: You can copy the current branch content:

git clone git@github.com:BurdetteLamar/csv.git
cd csv
git switch filter
git switch -c filter-keep
git push origin filter-keep

BTW, can remove more options from this PR to make this script smaller?

@BurdetteLamar
Copy link
Member Author

smaller

I'm thinking to close this PR and start fresh, with just --help and --version. Then we can have a new PR for each option as we go, adjusting the help text and version accordingly. Does that work for you?

@kou
Copy link
Member

kou commented Nov 18, 2024

Yes. It's better than the current PR.

@BurdetteLamar
Copy link
Member Author

Closing this; will begin again with a new PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants