I was able to split the quotes.csv file (from the first post) into nearly 2000 per-symbol files, which are zipped in the attached file mfquotes.zip. The csv file format is the same as that of the csv files in the previous post ("sptmcsv20170805.zip").
I have done this so that similar comparisons can be made against the single file quotes.csv. I also think this is the format Premium Data uses when you export data to csv. That is, the symbol is the filename and each line is "date,open,high,low,close,volume". I hope that's right, because that's why I added this feature.
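For anyone who wants to try the split themselves, here is a rough sketch of how it could be done in plain Python. It is not the script I used. It assumes the combined file has no header row and that each line is "symbol,date,open,high,low,close,volume"; the function name split_by_symbol is made up for illustration.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_by_symbol(combined_csv, out_dir):
    """Write one <symbol>.csv per symbol; return the number of files written."""
    rows_by_symbol = defaultdict(list)
    with open(combined_csv, newline="") as src:
        for row in csv.reader(src):
            if not row:
                continue  # skip blank lines
            # Assumed layout: symbol first, then date,open,high,low,close,volume.
            symbol, *fields = row
            rows_by_symbol[symbol].append(fields)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for symbol, rows in rows_by_symbol.items():
        # The symbol becomes the filename; the symbol column is dropped.
        with open(out / f"{symbol}.csv", "w", newline="") as dst:
            csv.writer(dst).writerows(rows)
    return len(rows_by_symbol)
```

Grouping the rows in memory first means each output file is opened only once, which is fine for a couple of months of data.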
So when we run "python3 sptmcsv.py" in the same directory as the 2000 files, we get
[ATTACH=full]72313[/ATTACH]
I notice the database load here takes about 5 times as long as when the script was run with the single file "quotes.csv". This is probably because of the opening and closing of about 2000 files. Since there is at most 2 months' worth of data for each security, I expect the relative difference would be a lot smaller for, say, 5 years' worth of data, since the per-file overhead would be spread over many more rows. Overall, not too bad, and all the other statistics are pretty much the same.
And now on to using just plain Python (without shared libraries). I now want to answer the question "Why don't I just use the pandas library to do all this?".
So what I plan to do using pandas is create a dictionary of dataframes, where the key is the symbol and the value is the dataframe containing the price data. The input files follow the same 2 formats discussed earlier. We want to compare iterating through all 2000 symbols in pandas with my shared lib solution. The main comparison is the Simple Moving Average. I don't want to use the built-in library functions for it, because the programmer must be able to create their own indicators.
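To give an idea of the shape of it, here is a minimal sketch of the dictionary of dataframes with a hand-written Simple Moving Average. It only covers the one-file-per-symbol format and assumes the files have no header row. The names load_symbol_files, sma and add_sma_to_all are just placeholders for illustration, not the final code.

```python
import math
from pathlib import Path

import pandas as pd

COLUMNS = ["date", "open", "high", "low", "close", "volume"]


def load_symbol_files(directory):
    """Return {symbol: DataFrame} for every .csv file in `directory`."""
    frames = {}
    for path in sorted(Path(directory).glob("*.csv")):
        # Assumes no header row; the filename (minus .csv) is the symbol.
        frames[path.stem] = pd.read_csv(path, header=None, names=COLUMNS)
    return frames


def sma(closes, period):
    """Hand-written simple moving average using a running sum.

    Returns a list the same length as `closes`; the first period - 1
    entries are NaN because a full window is not yet available.
    """
    out = [math.nan] * len(closes)
    running = 0.0
    for i, price in enumerate(closes):
        running += price
        if i >= period:
            running -= closes[i - period]  # drop the value leaving the window
        if i >= period - 1:
            out[i] = running / period
    return out


def add_sma_to_all(frames, period):
    """Iterate over every symbol and add an sma<period> column in place."""
    for df in frames.values():
        df[f"sma{period}"] = sma(df["close"].tolist(), period)
```

Usage would be something like frames = load_symbol_files(".") followed by add_sma_to_all(frames, 20), and the time taken by that loop is what gets compared against the shared lib.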
Stay tuned.
Andrew