Generic Timestamp Converter
- Without any additional options, it converts the first column from the format %d/%m/%Y:%H.%M.%S to seconds since the epoch and prints the remaining columns unchanged.
- Default input is stdin. Use -i to set a filename.
- Default output is stdout. Use -o to set a filename.
- If there are more timestamp columns, use -t to specify them.
- If the timestamp can have multiple formats, use -f to specify a list of them between double quotes.
- Keep in mind that both -t and -f affect performance.
- The default separator is the semicolon. To change it, use -s.
- If you want to exclude lines based on the timestamp, use --start and --end to specify the interval (in seconds since epoch).
- If there are multiple timestamp columns, specify the main timestamp with --main_ts.
- To choose which columns to exclude or include, use --exclude and --include.
- To print milliseconds since the epoch instead of seconds, use --ms.
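
The behaviour described by the options above can be sketched in a few lines of Python. This is only an illustration, not the code of convert_ts.py: the function name `convert_line` and its parameters are hypothetical, and it assumes that timestamps are interpreted as UTC and that a column matching none of the formats is left unchanged.

```python
import calendar
from datetime import datetime

DEFAULT_FORMAT = "%d/%m/%Y:%H.%M.%S"  # default input format named above


def convert_line(line, formats=(DEFAULT_FORMAT,), ts_columns=(0,), sep=";", ms=False):
    """Return the line with the selected columns replaced by epoch values."""
    fields = line.rstrip("\n").split(sep)
    for col in ts_columns:
        for fmt in formats:
            try:
                parsed = datetime.strptime(fields[col], fmt)
            except ValueError:
                continue  # this format did not match, try the next one
            # Assumption: the timestamp is in UTC.
            epoch = calendar.timegm(parsed.timetuple())
            fields[col] = str(epoch * 1000 if ms else epoch)
            break
    return sep.join(fields)


print(convert_line("06/12/2017:22.46.53;foo;bar"))
print(convert_line("2017/12/06 22:46:53;foo",
                   formats=(DEFAULT_FORMAT, "%Y/%m/%d %H:%M:%S")))
```

Every extra timestamp column and every extra candidate format adds more strptime calls per line, which is one plausible reason these options cost throughput.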
Timestamp formats use the directives of Python's strptime function:
| Directive | Meaning |
|---|---|
| %a | Weekday as locale’s abbreviated name. |
| %A | Weekday as locale’s full name. |
| %w | Weekday as a decimal number, where 0 is Sunday and 6 is Saturday. |
| %d | Day of the month as a zero-padded decimal number. |
| %b | Month as locale’s abbreviated name. |
| %B | Month as locale’s full name. |
| %m | Month as a zero-padded decimal number. |
| %y | Year without century as a zero-padded decimal number. |
| %Y | Year with century as a decimal number. |
| %H | Hour (24-hour clock) as a zero-padded decimal number. |
| %I | Hour (12-hour clock) as a zero-padded decimal number. |
| %p | Locale’s equivalent of either AM or PM. |
| %M | Minute as a zero-padded decimal number. |
| %S | Second as a zero-padded decimal number. |
| %f | Microsecond as a decimal number, zero-padded on the left. |
| %j | Day of the year as a zero-padded decimal number. |
| %U | Week number of the year (Sunday as the first day of the week) as a zero-padded decimal number. All days in a new year preceding the first Sunday are considered to be in week 0. |
| %W | Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0. |
| %c | Locale’s appropriate date and time representation. |
| %x | Locale’s appropriate date representation. |
| %X | Locale’s appropriate time representation. |
| %% | A literal '%' character. |
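
As a quick illustration of a few directives from the table, the snippet below calls datetime.strptime directly. The input strings are made up for the example; only locale-independent directives are used so the result does not depend on the system locale.

```python
from datetime import datetime

# Day of the year (%j) combined with microseconds (%f).
a = datetime.strptime("2017-340 22:46:53.250000", "%Y-%j %H:%M:%S.%f")

# Two-digit year (%y) and a literal percent sign (%%).
b = datetime.strptime("100% 06/12/17", "100%% %d/%m/%y")

print(a)
print(b)
```

Both calls resolve to 6 December 2017; the second one has no time part, so it defaults to midnight.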
%z is not available for function datetime.strptime in Python 2.x :-(
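
One possible workaround, shown here only as a sketch and not as something convert_ts.py does, is to split a trailing numeric offset off the string and apply it by hand. The helper name `parse_with_offset` is hypothetical, and it assumes the offset is always the last five characters in the form +HHMM or -HHMM.

```python
import calendar
from datetime import datetime, timedelta


def parse_with_offset(text, fmt):
    """Parse text ending in a +HHMM / -HHMM offset; return UTC epoch seconds."""
    body, offset = text[:-5].rstrip(), text[-5:]
    sign = -1 if offset[0] == "-" else 1
    delta = timedelta(hours=int(offset[1:3]), minutes=int(offset[3:5]))
    # Local time minus its offset gives UTC.
    utc = datetime.strptime(body, fmt) - sign * delta
    return calendar.timegm(utc.timetuple())


print(parse_with_offset("2017/12/06 23:46:53 +0100", "%Y/%m/%d %H:%M:%S"))
```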
I plan to port this code to Python 3. I want to write it purely in Python so that it can run with PyPy.
Python 2.7.10 and PyPy 5.9.0 were used.
PyPy was run with the following options: --jit vec=1 --jit vec_all=1.
The following trick was also used, which oddly boosts speed by a lot: `os.environ['TZ'] = 'GMT'`. For example, throughput went from 95.6K lines/s to 129K lines/s.
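
A rough way to try the TZ setting in isolation is sketched below. It assumes a Unix system (time.tzset is not available on Windows) and a conversion that goes through time.mktime, which may not be how convert_ts.py works internally. The helpers `to_epoch` and `lines_per_second` are hypothetical names used only for this example.

```python
import os
import time

os.environ["TZ"] = "GMT"      # the setting quoted above
if hasattr(time, "tzset"):    # tzset exists on Unix only
    time.tzset()              # make the running process pick up the new TZ


def to_epoch(text, fmt="%Y/%m/%d %H:%M:%S"):
    # mktime interprets the fields as local time, which is GMT after the lines above.
    return int(time.mktime(time.strptime(text, fmt)))


def lines_per_second(n=20000):
    # Time n conversions of the same timestamp and report the rate.
    start = time.time()
    for _ in range(n):
        to_epoch("2017/12/06 22:46:53")
    return n / (time.time() - start)


print(to_epoch("2017/12/06 22:46:53"))
print("%.0f lines/s" % lines_per_second())
```

Running it once with and once without the two TZ lines gives a comparison on your own machine; the figures will differ from the ones quoted above.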
Replace python with pypy or python3 as appropriate:

    yes "2017/12/06 22:46:53;2017/12/06 22:46:53;2017/12/06 22:46:53;2017/12/06 22:46:53;2017/12/06 22:46:53" | python convert_ts.py -t 0 -f "%Y/%m/%d %H:%M:%S" | pv -l | head -2000000 > /dev/null
    yes "2017/12/06 22:46:53;2017/12/06 22:46:53;2017/12/06 22:46:53;2017/12/06 22:46:53;2017/12/06 22:46:53" | python convert_ts.py -t 0 1 2 3 4 -f "%Y/%m/%d %H:%M:%S" | pv -l | head -2000000 > /dev/null