Offering any help to members looking to get into the auto trading space.
Anything from hardware selection and optimization to networking and connectivity to data feeds, execution, etc.
I'm going to add a few topics in this thread focusing on platforms, hardware, connectivity, data & execution.
If anyone has questions post them up and I'll be happy to answer as best I can.
I have had some latency hurdles lately. I think I've been able to overcome them, but the coming weeks will be the sure way to tell.
I have also had lots of guys email me for help, so it would be nice to consolidate many of the requests here vs. private emails. Everyone is asking essentially the same questions, so it makes sense to post the answers where everyone can see them.
I have a whole bunch of questions as I look to transition from a medium frequency retail operation to something a bit more sophisticated. It's certainly not fair to expect you to answer all of them, so I'll just puke out some of them here and you can pick and choose what you feel like addressing...if anything (you're very generous to do this in the first place).
To give some background, I (one man operation) currently trade 8 fully automated models with holding times from 5 mins - 5 days (avg. trade is something like 2.5 hours). I do this using IQFeed, Amibroker, and IB. My daily symbol universe is around 1000 equities. Amibroker is not event driven and it takes ~1.9 seconds to loop through and check all symbols for signals from the models. There is also a 1 second buffer between loops, so all told, each symbol is checked every 2.9 seconds. I realize that would get me laughed out of the room in HFT circles, but for my purposes thus far, it's been just fine...and very robust. However, I feel like I'm running out of low hanging fruit in the medium frequency space...and frankly, am getting a bit bored.
As I've started to look into some higher frequency ideas, it's become clear to me that every little thing is a great deal of work; therefore, as I build out infrastructure I want to make sure that it's not something I'm going to outgrow a few months after it's finished. It would be slightly generous to describe me as a programming hack (am productive with enough googling but have massive knowledge gaps)...began as an excel/vba expert and have since gotten a few things done in java, mysql, etc.. I enjoy the developing/programming side of things, so having some growing pains along the way wouldn't be the end of the world, but I do want to keep the programming as simple as possible.
Here are my goals and/or what I think I'll need (could be way off):
1. Be datafeed and broker agnostic...really want no 3rd party reliance at all (e.g. if Amibroker disappeared tomorrow, I'd be screwed)
2. Ability to generate signals on large universe (1-3k symbols) w/o major latency
--Would this have to be written in C++?
--Is a high end datafeed necessary...NxCore...direct from exchange? Cost?
3. Create a tick database
-- Would I just save text files to my hard drive and then back them up online..or use an actual DB?
-- Again would probably shoot for ~3k symbols...source for the data?
4. Backtesting
-- I'm guessing the best way to do this would be to read the data from the database into memory and then run backtesting logic. Recommended language for this?
5. A cheaper broker
-- Right now on a good month, I'm somewhere near 1M shares. For retail, IB isn't bad at .002/sh but if I were to go with some of the higher freq strats I've sketched out, the min $1/trade would kill me (I try to eliminate idiosyncratic risk by trading tiny and frequent). Any suggestions?
6. Proper hardware configuration (currently have i7 950 w/ 24GB RAM..and a laptop)
-- Can probably use my current machine for backtesting/development...would I rent something colo'd for execution? Is that secure? Cost?
I have a whole bunch of questions as I look to transition from a medium frequency retail operation to something a bit more sophisticated. It's certainly not fair to expect you to answer all of them, so I'll just puke out some of them here and you can pick and choose what you feel like addressing...if anything (you're very generous to do this in the first place).
Sure no worries – Happy to help or at least attempt to give help/honest answers. If I can’t help or don’t know I’ll say so.
To give some background, I (one man operation) currently trade 8 fully automated models with holding times from 5 mins - 5 days (avg. trade is something like 2.5 hours). I do this using IQFeed, Amibroker, and IB. My daily symbol universe is around 1000 equities. Amibroker is not event driven and it takes ~1.9 seconds to loop through and check all symbols for signals from the models. There is also a 1 second buffer between loops, so all told, each symbol is checked every 2.9 seconds. I realize that would get me laughed out of the room in HFT circles, but for my purposes thus far, it's been just fine...and very robust. However, I feel like I'm running out of low hanging fruit in the medium frequency space...and frankly, am getting a bit bored.
OK I can’t cure the boredom but there is always YouTube… or redtube... Not sure if you are married or not but you could always have kids – I hear they are expensive and exciting.
There isn’t much low hanging fruit left anywhere so I would consider yourself lucky to have a profitable strategy. Most people (I mean traders) aren’t generally open or honest but the reality of the space is that it is a pretty lonely career and you don’t make nearly as much money as the general public assumes you do. The more strategies you build the more you will want to keep it to yourself and the fewer people you will trust, etc. and it spirals until you become part of the paranoid ZeroHedge group.
It will always be boring – the space is lonely. You have to accept that as part of the job. Low Latency HFT/ATS is even more boring and isolated than what you are dealing with now.
This is where those HFT/ATS misconceptions come into play. You have an automated strategy (not even sure if it is fully automated or not) but the frequency thing is totally subjective, so to one person you may be an HFT guy and to another you might be one step above a point-and-click monkey. As long as your bottom line is green more days than it is red, don’t concern yourself with a classification and focus on improving your existing strategies vs. trying to go “bigger” by getting into HFT.
One question – Are you OK with the ~1.9 second loop and 2.9sec per-symbol check? Do you think it needs to be faster or are you asking me/us if we think it should be faster?
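To put numbers on that question, here is a rough sketch of how stale a signal can get in a scan-then-pause loop. The 1.9 s scan and 1 s buffer come from the post above; the function name and the assumption that events arrive uniformly in time are mine, for illustration only.

```python
def polling_staleness(scan_seconds: float, pause_seconds: float) -> dict:
    """Delay between an event and the next check in a scan-then-pause loop."""
    cycle = scan_seconds + pause_seconds
    return {
        "cycle_seconds": cycle,        # time between two checks of the same symbol
        "worst_case_delay": cycle,     # event lands just after the symbol was checked
        "average_delay": cycle / 2.0,  # assumes events arrive uniformly in time
    }


# Figures from the post: ~1.9 s per full scan plus a 1 s buffer between scans.
current = polling_staleness(scan_seconds=1.9, pause_seconds=1.0)
no_pause = polling_staleness(scan_seconds=1.9, pause_seconds=0.0)

print(f"current loop : check every {current['cycle_seconds']:.1f}s, "
      f"average delay {current['average_delay']:.2f}s")
print(f"without pause: check every {no_pause['cycle_seconds']:.1f}s, "
      f"average delay {no_pause['average_delay']:.2f}s")
```

Dropping the buffer only shortens the cycle to the scan time itself; getting below that would need a faster scan or an event-driven design.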
As I've started to look into some higher frequency ideas, it's become clear to me that every little thing is a great deal of work; therefore, as I build out infrastructure I want to make sure that it's not something I'm going to outgrow a few months after it's finished. It would be slightly generous to describe me as a programming hack (am productive with enough googling but have massive knowledge gaps)...began as an excel/vba expert and have since gotten a few things done in java, mysql, etc.. I enjoy the developing/programming side of things, so having some growing pains along the way wouldn't be the end of the world, but I do want to keep the programming as simple as possible.
It is good that you are honest and realistic. Most people (including myself) think/thought that you can just sit down and learn a language. This simply isn’t the case. There is a big difference between using brute force and writing something with elegance and efficiency.
Here are my goals and/or what I think I'll need (could be way off):
1. Be datafeed and broker agnostic...really want no 3rd party reliance at all (e.g. if Amibroker disappeared tomorrow, I'd be screwed)
You are on a retail platform – you have chosen pretty good service providers but still retail. You are always going to have some of that around because they didn’t make up the saying “you get what you pay for” for nothing.
Your system is already broken up into three parts, which is an excellent starting point. This is something that most people struggle to comprehend: he (or she), meaning BT the poster, buys data from IQ Feed, points that data into Amibroker and then decides what to do. He (or she) then enters orders into Amibroker, which are sent to Interactive Brokers to execute. So: Data, OMS (order management system), Execution.
On the institutional side or in “big boy” trading most shops will say things like (using this example) “I execute Amibroker and clear IB” meaning you enter orders into Amibroker and you execute and clear your trades through Interactive Brokers. Data is fairly generic. There are tons of vendors out there and generally speaking they all sell you a bunch of BS pitching that they are the fastest or the best, etc.
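One way to keep that Data / OMS / Execution split vendor-agnostic is to code the strategy against small interfaces and plug each vendor in behind them. This is only a minimal sketch: every class and function name here (DataFeed, Broker, FakeFeed, PaperBroker, run_once) is hypothetical, and none of it is the IQFeed, Amibroker or IB API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    symbol: str
    bid: float
    ask: float


@dataclass(frozen=True)
class Order:
    symbol: str
    side: str       # "BUY" or "SELL"
    quantity: int


class DataFeed(ABC):
    """Anything that can hand back the latest quote for a symbol."""
    @abstractmethod
    def latest(self, symbol: str) -> Quote: ...


class Broker(ABC):
    """Anything that can accept an order and return an order id."""
    @abstractmethod
    def submit(self, order: Order) -> str: ...


class FakeFeed(DataFeed):
    """Stand-in feed backed by a dict; a real adapter would wrap a vendor API."""
    def __init__(self, quotes):
        self._quotes = quotes

    def latest(self, symbol):
        return self._quotes[symbol]


class PaperBroker(Broker):
    """Stand-in broker that only records orders in memory."""
    def __init__(self):
        self.orders = []

    def submit(self, order):
        self.orders.append(order)
        return f"paper-{len(self.orders)}"


def run_once(feed: DataFeed, broker: Broker, symbol: str, max_spread: float):
    """Toy OMS step: buy 100 shares when the spread is tight enough."""
    quote = feed.latest(symbol)
    if quote.ask - quote.bid <= max_spread:
        return broker.submit(Order(symbol, "BUY", 100))
    return None


feed = FakeFeed({"XYZ": Quote("XYZ", 10.00, 10.01)})
broker = PaperBroker()
order_id = run_once(feed, broker, "XYZ", max_spread=0.02)
print(order_id, broker.orders)
```

The strategy logic in run_once never names a vendor, so swapping the data source or the broker would mean writing one new adapter class rather than touching the models.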
IQ Feed:
For what it’s worth, I spent about 35 minutes on a support chat today with IQFeed and then another 35-40 minutes with a guy named Jay at IQFeed. It was pretty awesome to have a data provider be that open and honest about their latency, data sources, price structure and data integrity. I was very impressed – I haven’t tested their data but I have no reason to doubt what he said. For what you pay it seems like an amazing deal and for me as a random nobody to call up and get answers on questions like latency between their NYSE feed and relay servers or what they do in the event of dropped source packets – that’s rare at the institutional level. The experience was much closer to a top-tier institutional data provider so for $65/month + exchange fees you have a deal.
With regards to redundancy or being data provider agnostic – try pulling TCP quotes from IB as a backup solution.
IQ Feed stated that they keep their quote servers in Nebraska so they estimate that there is ~20ms of latency between the data source and their servers – and then add on whatever extra latency there is between you and their relay quote servers. I personally think that ~20ms is on the high side and that they were again, being totally open and honest – which is good. Depending on the feed I think they are probably closer to ~15ms on average but I don’t know enough to really state that with confidence. The interaction between the human eye and the brain takes approximately 15 milliseconds. That means that if I were to stand in front of you with a paint ball gun and shoot it at you – it would take you ~15ms from when your eyes see me pull the trigger to when you understand that you need to flinch. I was pleasantly surprised with IQFeed overall and I’d say you hit a home run there with service vs. pricing.
I sent a general inquiry to Amibroker today. I know nothing about them so I’ll have to come back and update when I learn more. They replied to me and pointed me to various knowledge base articles and forums – I honestly might not have the time to learn about how good/bad/neutral they really are.
My initial (uneducated) impression is that IQFeed is solid and that you either have issues with data handling/processing (the data gets to your machine fine – you just can’t process it fast enough) or that your OMS (order management system – Amibroker) is slow. Is the 1-second delay something you did or something that is an Amibroker default?
Also – your hardware may be old or not configured properly – but I don’t think you have any issues with data or data delivery.
2. Ability to generate signals on large universe (1-3k symbols) w/o major latency
IQ said they limit their base universe to 500 or 800 stocks (I forget) and that you can pay a slight premium to bump that up to 1800 symbols. If you want 3k symbols you probably need a big-boy data pipe. Around 1500 you’ll probably start to max out IQ, so don’t complain if things get slow or hang between 1500 and 1800: you are maxing out something that’s meant to be a retail solution, not an enterprise-level data pipe for low latency HFT.
Could you swing NxCore? Change brokers or look at a different data feed? Personally I think your issue is with hardware and OMS so expanding your stock universe would be foolish at this point.
--Would this have to be written in C++?
I don’t know enough to answer this properly – and it might start to get to the upper end of my knowledge. That said, it all depends on the volume of data you receive vs. the latency. If you pull 50k stocks that trade OTC for a combined volume of 30million shares traded per day – you probably can handle it. On the other hand you may crush your system by pulling the 10 most active stocks on each exchange in the USA. The volume of data you receive is more important than the number of stocks you scan.
Generally speaking, a C++ data handler will be better than a C# (or other) data handler. Network cards also have a lot to do with this as well.
--Is a high end datafeed necessary...NxCore...direct from exchange? Cost?
In your case and at this point (with very little knowledge and understanding of your strategies) probably not… I was really impressed with IQ and “the devil you know” is very true. Best to know your situation than to be sold snake oil by a used car salesman. Equity data is generated in NYC (really the NJ data centers) and then shipped out to the Midwest to IQ’s relay servers. From there it goes to you – wherever you are. You need to balance that latency (being colo at IQ in Nebraska would get you fastest data but slower execution to IB in CT) vs. the bigger picture.
NxCore can give you a better data feed (server in NYC) but at a premium. That would help because you could put a machine between IB and NxCore – but I don’t think you need that at this point since you are already having data handling issues. (again I don’t know much)
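A quick way to think about that trade-off is to add up the hops on the path from exchange to your order arriving at the broker. In the sketch below only the ~20 ms figure comes from what IQFeed said; every other number is a made-up placeholder that you would replace with your own measurements, and the hop names are hypothetical.

```python
def total_latency_ms(hops: dict) -> float:
    """Sum one-way hop latencies (milliseconds) along a data-to-order path."""
    return sum(hops.values())


# Only the 20 ms hop is from the thread. All other values are placeholders.
trade_from_home = {
    "exchange_to_iq_servers": 20.0,  # IQFeed's own estimate
    "iq_servers_to_you": 30.0,       # placeholder: measure this yourself
    "you_to_broker": 25.0,           # placeholder: measure this yourself
}
colo_at_feed = {
    "exchange_to_iq_servers": 20.0,  # IQFeed's own estimate
    "iq_servers_to_you": 1.0,        # placeholder: same data center
    "you_to_broker": 40.0,           # placeholder: longer haul to the broker
}

for name, hops in (("home", trade_from_home), ("colo at feed", colo_at_feed)):
    print(f"{name}: {total_latency_ms(hops):.1f} ms from exchange to broker")
```

The point is only that a faster data hop can be cancelled out by a slower execution hop, so it is the total that matters.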
3. Create a tick database
The IQ guy said today that all that information is stored locally on your machine and that you can recall it at any time. Are you trying to optimize or backtest realtime or just nightly/weekly/monthly? Do you have space?
The best thing you can do is sync your machine’s time server to whatever time server your data provider also timestamps with. (I just sent an email to ask them, I’ll post here when I find out)
Backtest your current live strategy (your live executions from today and tomorrow) against the data you record daily. That will give you a very good idea of your realistic P&L. Don’t try to tweak the backtest to make the numbers match up – just make sure that the numbers are consistent. For example, if your backtest always shows 25% better returns than IRL (in real life) **AND** you can be confident in that number because it is consistent, then you know whether a strategy will work or not. It has less to do with the backtest’s P&L being positive and everything to do with the backtest being consistent relative to real life returns. Record tomorrow and run live tomorrow, then backtest tomorrow and compare. If the backtest says you lost $50k tomorrow but you actually broke even, and that gap is consistent over several days/weeks, then you know what your metrics are.
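The consistency check can be sketched in a few lines: take the daily gap between backtest and live P&L and look at how much that gap moves around. The P&L figures below are invented purely to show the mechanics, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def backtest_gap(backtest_pnl, live_pnl):
    """Daily (backtest - live) P&L gap, with its mean and spread."""
    if len(backtest_pnl) != len(live_pnl):
        raise ValueError("need one backtest and one live figure per day")
    gaps = [b - l for b, l in zip(backtest_pnl, live_pnl)]
    return {
        "gaps": gaps,
        "mean_gap": mean(gaps),      # how optimistic the backtest is on average
        "gap_stdev": pstdev(gaps),   # small spread = consistent, usable backtest
    }


# Invented numbers: the backtest overstates every day by the same amount.
backtest_days = [1200, -300, 800, 500]
live_days = [1000, -500, 600, 300]

report = backtest_gap(backtest_days, live_days)
print(report)
```

A backtest that is always off by roughly the same amount is more useful than one that matches live results on some days and misses badly on others.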
I have a massive tick database repository that I am working on getting clearance to make public (legal stuff). I’ve approached an open-source vendor and they gave me a terrible response (over money), so I’m still working on that. You may have an answer to your tick database question shortly – but it is hard because it is a lot of data.
-- Would I just save text files to my hard drive and then back them up online..or use an actual DB?
You are going to want to use a database and have to convert the files into a format that can be read by your system. It is pretty complicated – but nothing crazy.
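As a rough idea of what that conversion looks like, here is a minimal sketch that loads tick text into SQLite using only the Python standard library. The CSV column layout, the table name and the sample rows are all assumptions for illustration; the real export format from your data provider will differ.

```python
import csv
import io
import sqlite3

# Assumed layout (NOT a real vendor format): timestamp,symbol,price,size
SAMPLE_TICKS = """timestamp,symbol,price,size
2000-01-03T09:30:00.125,XYZ,10.01,300
2000-01-03T09:30:00.480,ABC,25.50,100
2000-01-03T09:30:01.002,XYZ,10.02,200
"""


def load_ticks(conn: sqlite3.Connection, text_file) -> int:
    """Parse tick CSV text and insert the rows into a 'ticks' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ticks ("
        "ts TEXT NOT NULL, symbol TEXT NOT NULL, "
        "price REAL NOT NULL, size INTEGER NOT NULL)"
    )
    # Index so per-symbol, time-ordered reads stay fast as the table grows.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_ticks_symbol_ts ON ticks (symbol, ts)"
    )
    rows = [
        (r["timestamp"], r["symbol"], float(r["price"]), int(r["size"]))
        for r in csv.DictReader(text_file)
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO ticks VALUES (?, ?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
loaded = load_ticks(conn, io.StringIO(SAMPLE_TICKS))
xyz_ticks = conn.execute(
    "SELECT ts, price, size FROM ticks WHERE symbol = ? ORDER BY ts", ("XYZ",)
).fetchall()
print(loaded, xyz_ticks)
```

SQLite is just the simplest thing to try on one machine; at 3k symbols of tick data you would likely want to test whether it holds up before committing to it.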
-- Again would probably shoot for ~3k symbols...source for the data?
IQ isn’t built to push more than 1800 names. Bloomberg… NxCore… At 3k symbols you start getting into “gotta pay to play” space. Regardless of latency you are going to be pulling a ton of data and using a ton of bandwidth. Activ…
Worry about a few little improvements on the current system first. You can be more profitable as-is and in my experience changing data providers or even latency can totally throw your whole system off balance.
4. Backtesting
-- I'm guessing the best way to do this would be to read the data from the database into memory and then run backtesting logic. Recommended language for this?
Limitations quickly become related to bandwidth and hdd/cpu/ram I/O more than language. Write it in whatever is easiest and then ask a bunch of questions before you start buying every 3TB HDD Newegg is selling. Managing a data array that large is painful. My largest is 89TB and growing. I have paired 10G Ethernet ports (20G total) and it’s still a pain to work with an array that large.
Buy a $200 or $300 hard drive (2-3TB) and record a few weeks of data. Play around with backtest just to do it so that you understand. Record your symbols that you trade and then backtest whatever strategies you ran that day against your historical data. Things are so time sensitive that you really need to understand your data vs. your system not just the results and backtest vs. actual performance.
5. A cheaper broker
-- Right now on a good month, I'm somewhere near 1M shares. For retail, IB isn't bad at .002/sh but if I were to go with some of the higher freq strats I've sketched out, the min $1/trade would kill me (I try to eliminate idiosyncratic risk by trading tiny and frequent). Any suggestions?
There are tons out there – it will become a chicken vs. egg game. Have you looked at cutting other costs first – hardware, software, data, provide vs. take liquidity, etc.?
I would say that you would be better off developing a trusted backtest system and a tried & true methodology than worrying about rates and bouncing around. Everyone I’ve ever known to switch jobs for “short money” (like a $5k bump on a $100k base) has regretted it. The only reason why a trader switches firms is because he/she is losing money. If you are making money your firm will accommodate. I know IB is retail but there are other great retail shops out there (portfolio margin) that won’t take you to the cleaners… but you need the volume first.
I’d say you are better off to build out your strategies and break even for a few months while your volume increases than you are to just jump ship and start from scratch. Even if you find the holy grail and you end up paying IB $1k/day in fees… at least you can document it and you have established that you can increase the volume without losing money.
Transition is the hardest thing – latency changes on both ends (orders & data). It’s like starting from scratch. It is terrible because what worked last week no longer works. If you do switch: ramp up at IB, open your new account, transition slowly, use demo accounts as long as you can, record data from your demo accounts, backtest your existing strategies against the new data, and compare.
Changing firms is a disaster. IB is what it is (personally I don’t use them) but “the devil you know”.
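For reference, here is the arithmetic behind the $1 minimum concern, taking the two figures exactly as stated in the question ($0.002/share, $1 minimum per trade). The function names are mine and this ignores exchange fees, rebates and anything else on a real commission schedule.

```python
PER_SHARE = 0.002     # USD per share, as quoted in the question
MIN_PER_TRADE = 1.00  # USD minimum per trade, as quoted in the question


def commission(shares: int) -> float:
    """Commission for one trade under a per-share rate with a minimum."""
    return max(MIN_PER_TRADE, shares * PER_SHARE)


def effective_rate(shares: int) -> float:
    """What the trade actually costs per share once the minimum is applied."""
    return commission(shares) / shares


# Order size below which the minimum, not the per-share rate, sets the cost.
breakeven_shares = MIN_PER_TRADE / PER_SHARE

for size in (50, 100, 500, 1000):
    print(f"{size:>5} shares: ${commission(size):.2f} total, "
          f"${effective_rate(size):.4f}/share")
print(f"minimum stops mattering at about {breakeven_shares:.0f} shares per trade")
```

So under those two figures, a 100-share order pays five times the headline per-share rate, which is why tiny, frequent orders feel the minimum so much.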
6. Proper hardware configuration (currently have i7 950 w/ 24GB RAM..and a laptop)
You could probably run your entire operation on a dual-core 2.66GHz with 3GB of RAM and a decent server-grade network card with 2/4 ports – if you ran Linux Server and used it as a dedicated machine.
If you want to talk hardware I’m happy to start a new thread – I have no idea what HDD or OS you have or how your OS is configured, etc. or what NICs you have, etc. etc.
99% of the time hardware configuration is a main source of improvement. Second to that is network & firewalls, then connectivity (no idea who your ISP is).
The other thing is how much other crap do you run? How many monitors… youtube, skype, AIM, google, itunes, porn, chats & forums, etc. etc. streaming TV, radio, etc.
The vast majority of the time people have hardware that is 200-500% overkill, but they have it set up wrong or just run so much crap on a single box (that’s not set up properly) that it drags the whole system down. If you want to talk specifics I’m happy to help on the hardware end – just don’t post all your specs & what software you run or your ISP… too many crazies out there these days. If you want to start a generic thread and post your general HW specs and ask how best to optimize, that’s different. I’m happy to do either.
-- Can probably use my current machine for backtesting/development...would I rent something colo'd for execution? Is that secure? Cost?
You can 100% use what you have to do everything you need. I started selling/renting VMs (virtual machines) to traders as a “low latency solution”, and what it is evolving into is that they actually want to rent “internet” or “porn” VMs where they can keep their AIM/Google/Skype/Chat, email, etc. open 24/7/365. What I am finding is that most people already own hardware that is overkill for any of their trading needs – it’s just the extra crap that they run that slows the whole machine down. A lot of guys who started off renting machines in NYC, wanting to be closer to NYC for execution purposes, realized that it was easier to keep source code and execution on their own machine (because it is consistent) and just outsource the extra crap that bogs it down.
This isn’t great news for me because my selling point was low-latency not hiding your porn browsing from your wife… but it also says a lot in that the individual trader usually has the physical hardware and processing resources in their possession to implement an ATS.
If you want to rent a porn VM I start them at $25/month and they go up to $100/month. Windows 7 with 2-4GB RAM and 2-4 cores at 1.8-3.6GHz clock speed. Outside that I don’t think you would benefit from being in NYC. I think you have excellent data coming in and you need to make a few changes to your local machine, but overall your improvements will come from your data handling, your OMS and your local hardware optimization.
Try using your laptop for a few days to chat/AIM/IM/Skype watch porn, etc. See if you notice a difference on your main box. That’s an easy step one.
Sorry it took so long to reply. Things have been busy and it took a while to do a bit of research to respond properly. Also sorry for such a long reply. Hope it helps in the end. The biggest misconception about HFT or low latency automated trading is that being faster makes you more money. If you have something that works stick with it but it isn't really an arms race - the better way to describe it is more a higher barrier to entry. You have the low hanging fruit in your space... it costs more to get the low hanging fruit in a faster paced environment but you will probably also make more.
Optimize what you have and solidify your backtesting methods before you look to start new strategies.
Another comment about the boredom - go on Bob's chat sometime. Everyone on there is trying to trade (manually) and I'm commenting about the hot mom parking her car outside or the dude with one leg walking down the street with 3 little dogs in a baby carriage... (true story). It's boring... You stare at a screen and that's it. I don't watch stocks or pay attention to the markets - half the people on his chat don't like me because they think I post spam and all I do is chit-chat. It will never be exciting and the better your strategies do the more private and secretive you will become.
Not what it's cracked up to be.
Ask away – happy to help in any way I can.
Cheers,
Tom
Wow, thanks Tom - you really went above and beyond here...I know you're busy and contacting IQ and AB is off the charts.
I want to make sure I put together a thoughtful post that your response deserves, but unfortunately right now I'm trying to get to the bottom of a BSOD. Computer restarted right in the middle of trading today - second time it's happened during market hours this year...first time was semi-disastrous, this time (luckily) it didn't really matter. Honestly, stuff like this is my number one reason for considering a cloud solution.
Anyhow, just wanted to say thanks and that I may not get back until tomorrow depending on how this goes.
Man, this BSOD came at a terrible time...I haven't started my xmas shopping yet and am now terrified to leave the computer during market hours. Anyhow, if it's okay with you, I was thinking that I might respond to your posts in pieces. With all that's going on, it may be easier to quickly jot down a reply with a few spare minutes between other tasks...plus, there are a few things that I'd like to reflect on a bit more before addressing. I will start with this:
I agree with your characterization of the business. Ironically, some decent models can really take the fun out of it...once the chase and fear of failure is mostly gone and the babysitting/maintenance starts, the job becomes something entirely different. What you see in my initial post is symptomatic of that...every once in a while, out of the need for intellectual stimulation and to feel like I'm productively building something, I'll convince myself that wholesale changes are needed to get to the next level. Sure, I believe that done right these things will add to the bottom line as well (and there's always the delusions of grandeur that come along with the unknown), but as you correctly point out, the easiest bump in pnl will come from improving my current strats and/or set up. So, gaining some ideas and perspective from your reply, I think my upcoming comments will be geared towards doing that...but I'm going to try to do it in a way where I will be building tools that will also be useful in adding some higher frequency strats.
One question unrelated to trading...this scared me "don’t post all your specs & what software you run or your ISP… too many crazies out there these days"...would you mind elaborating on what hardware/software details are dangerous to share? I'm guessing that the answer may be surprising and others on the board would find it highly valuable as well.