so, as you might have noticed, the site went down on wednesday. it was back up on wednesday night/thursday morning for a bit of time, before disappearing again until sometime late friday.
the entire ordeal was exhausting and nonsensical. i wrote about what happened on wednesday, ready to post it on thursday morning, but the site went back down before i could get it up.
so here it is, part 1 of the nightmare:
on wednesday, i posted a quick article and then, about 20 minutes later, roxy im’s me telling me the site is down. but it’s not really down; it’s throwing a 404 error. and i know, from past experience, that this 404 error (all formatted and junk) is actually coming from the host. great, they have misconfigured something somewhere.
over the last 18 months or so, we have had some domain/host issues and i’ve ended up with 2 domains pointing to madeofglass. and it is times like this when that comes in handy. a quick check at the other domain shows that the files are there and the server is up. it’s just something with the actual domain resolving. ugh.
i call support. ‘my site is down.’ i explain everything that i am seeing. the guy i am talking to is a fool. he tells me over and over ‘the server is off.’ seriously. i try to counter with the fact that i can still get to everything, that i can still ftp into the site. the server is not off; it’s just a misconfiguration.
he promises me that it will be fixed within 30 minutes and cannot get off the phone fast enough with me, avoiding all of my questions and insistences that the server is not, in fact, off.
of course, 30 minutes later, the site isn’t up. now, near as i can tell, it went down around 3.15. i spoke to this doofus at about 3.45. i call back at 4.30, knowing i’ll be on hold for a while. i am.
i speak to csr #2. this guy isn’t much better. he insists that the first person i spoke to didn’t actually open a ticket for me, but this new guy, he will do that. the site will be back up in no more than an hour. well, fine. i clarify with this fellow that it is their fault. he confirms, which, actually, is unique in my experience. usually it is very difficult to get an admission of guilt.
5.30 rolls around. the site is still down and i call back. this time, i’m on hold for 45 minutes. 45 minutes. site is still down and has been for 3 hours now. it’s the end of the day, i’m sick, i’m tired and i want this fixed. and then along comes csr #3.
this might actually qualify as the worst customer service call i have ever had to deal with. i am a reasonable guy in these kinds of circumstances, but i generally don’t take a lot of shit.
i explain the situation to the guy, tell him i have a case number. he barely listens. he tells me that he will check on my site and will know something in 30 minutes. i tell him this is exactly what i have been hearing all afternoon. he lets me run with this for a good 15 or 20 seconds, never correcting me. then he says ‘i need to put you on hold for 3 minutes.’ ‘three?’ ‘yes, three. not thirty.’ oops. i misheard. i let him put me on hold, though he leaves me listening to their looped music for more like 8 or 9 minutes. maybe that’s spite, who knows. fine, i’ll take it as punishment for mishearing.
he comes back on the line and proves he hasn’t listened to a word i have said: he tells me he can get to my site. i am surprised, as i am sitting in front of the computer, refreshing and continuing to get the 404. i ask him what he sees. he describes a site to me. sadly, it is not mog.com. (tmi: the site used to run on asp and i’ve kept that version around. it exists on a different domain and a different package.) this guy wasn’t paying attention to anything i had said. if he had been, he wouldn’t have looked in the exact wrong place.
not an inspiring start.
he then continues by telling me the ‘server is off.’ as if this is an acceptable and perfectly natural explanation. this is where things start going a bit more wonky:
i tell him that this is not an explanation, especially since i can still get a response. he counters by insisting that they are moving data centers from new york to kansas and my machine has been turned off. (this, i believe, is true. it explains why hima noticed on monday that timestamps on the site are off by an hour. but, if this is the case, it means the servers were moved over the weekend. not yesterday.)
i tell him that this is unacceptable — it is a wednesday afternoon and simply pulling a site down without warning and then offering the excuse that it has been turned off isn’t the greatest example of how to please customers. especially when i am told over and over that it will be resolved within a few minutes and isn’t.
he then tells me in an exasperated tone that the other people i have spoken to today were lying to me. and that he is telling me the truth now.
seriously.
i explain, again, that the server isn’t off because i can still get to the files through this different domain. he says he can’t get in. i tell him that i am looking at the site in the browser.
‘oh, a browser. that’s why.’ what? ‘can you log into the server?’
i ask, ‘you mean through ftp?’
he counters with: ‘yes, that’s what we do here. we log into machines.’ uh. well, ok. and yeah, it turns out that i can’t get in through ftp.
he explains again that the server has been turned off. never mind that the ftp error is ‘wrong login’, meaning that the machine is responding. or that i am getting this 404 error.
i tell him that i just want to understand what is happening. it isn’t clear to me that the server is off based on the things i am seeing. i tell him that i just want a clear understanding of the issue. and this is the turning point.
he flips out, raising his voice almost to a yell, telling me that he has told me everything i need to know, that the story will not change and the site will be fixed in the next 24 hours.
i’m shocked by the tone he has taken with me. i have not raised my voice with him. i am positive that dealing with a frustrated customer can be exhausting, esp one who wants an explanation that perhaps one cannot actually give. but i never raised my voice and i was never abusive.
i tell him that there is no reason to raise his voice with me, that i just want to understand what is happening with my website, as the explanation given and the issues i am having don’t jibe.
he apologizes. and then refuses to tell me anything else. he tells me that the story will not change; he has told me all he is willing to and i will just have to trust him.
i explain that i don’t understand the situation and just want a better description. he refuses.
it’s clear now that i’m not going to get anything resolved — all i can do is wait. but i’m not letting this guy be an asshole to me. i ask for his supervisor.
he refuses. refuses.
i expected this — rarely can you get anything escalated. first he tells me that he will not transfer me. then he tells me that there is no supervisor. i ask him if he is unsupervised then. first he says yes, then he admits there is someone somewhere overseeing him, but they are on a different team elsewhere. i ask to speak to them.
at no point do i threaten him — in fact, i say that i just want to speak to someone who will give me the details i am looking for.
he refuses. i ask him if he thinks this is an example of good customer service. all i want is to understand. he tells me again that the server is off and i can do nothing but wait.
i tell him this isn’t good enough; i want to simply know what is going on so i can manage expectations for when the site will return.
he says nothing.
really.
we sit in silence on the phone for at least 30 seconds. i have made it very clear to him what i want. he, in turn, has decided the easiest way to deal with me is to say nothing. 30 seconds is a long time to sit in silence on the phone, esp when it is with someone who is supposed to be helping you.
i finally say, again, that i just want to understand what is happening.
his answer this time is ‘the admins will get in touch with you when your site is up again.’
i laugh. i tell him that he is now lying to me as well — we both know the admins won’t contact me. they didn’t contact me to tell me they were ‘turning off the server.’ they have never contacted me. he repeats this like a mantra.
i’m tired. i’ve been on the phone for over an hour. i just want an answer, one that is never going to arrive.
i ask for his name. he gives it to me eagerly: ahmad. but he refuses to transfer my call. i have no idea if it is a real name or not.
i ask again for an understanding of why i can get to the site from one domain but not another — this time, he gives me a garbage answer: ‘uh, i think the system is running some kind of raid setup and that would explain it.’ uh, actually, no it wouldn’t. at all. but whatever. it’s not going to go anywhere and i have better ways of wasting my time.
i end the call.
an hour later, the site is back up. except that even now, 18 hours later, i still can’t ftp into it. but at least you can read this. and i got an email from the hosting company asking me to take part in a cs survey. you can bet anything i will be sending the link to this post. it might not do any good at all, but someone ought to know.
of course, as i go to post this, the site is back down. lovely.
Dude, we could have this entire site running on a very fast network with a downtime of less than five minutes a year (average over the past two years) in about 1/2 hour plus DNS propagation time. You could even have the same site running on two different machines sharing a DB with a round robin DNS in front of it. All for next to nothing. El Brett and I can hook you up with but an email :)