POP from GMail vs. IMAP from K-State CNS

If you follow my posts about computer problems, you know that the reason I've continued to use Hotmail is that it downloads messages (via HTTP) to multiple computers running a client (Outlook Express or Windows Live Mail Desktop Beta). This is something one can do with Internet Message Access Protocol v4 (IMAP4), which Kansas State University's Computing and Network Services (KSU CNS) uses, but not with Post Office Protocol v3 (POP3), which GMail and Yahoo both use. With POP as GMail serves it, once some client has downloaded a message, it is flagged as "retrieved" and other mail readers cannot download it again. With IMAP, you can set your client to download headers only, and you can have several clients running without too much contention for the Inbox file on your mail server.

E-mail requirements and constraints

This seemed to present a problem for me: I get about 20000-30000 messages per year, amounting to 1Gb of mail with automatic spam filtering or 600Mb with additional spam filtering by hand. This is likely to grow to 1-2Gb a year as attachments keep bloating. Meanwhile, full text searching (which K-State's Webmail does, but which I generally don't need) is getting slower and slower. What I would like instead is very fast header search (100000 messages in no more than 60 seconds). More important, I need to have direct access to those 20K messages a year on each system (it's a gigabyte a year, but I have 200-300Gb drives now in my desktop PCs and 100Gb ones in my notebooks). I shuttle back and forth between my campus and home offices during the week, and rotate among several home systems at different times of day. Thus far, it's been "IMAP or bust".

On top of this are my WinXP and application stability issues, and the fact that I was spread out over three applications: Mozilla Thunderbird, Microsoft Outlook Express (or Live Mail Desktop Beta), and web interfaces to GMail, usually accessed from Mozilla Firefox. I previously ran Microsoft Outlook Express (MSOE) versions 5-6 and Microsoft Outlook 2003 to access my Hotmail accounts (hsuwh and rizanabsith), but there were problems. First, MSOE was slow on sending mail and did not do TLS authentication, which KSU CNS requires for its outgoing Simple Mail Transfer Protocol (SMTP) server, auth.smtp.ksu.edu. Second, Outlook was slow in retrieving mail and did not do header search as quickly as MSOE; also, I sometimes encountered a bug in Outlook XP where the ability to search message headers would spontaneously go away forever (or until Outlook XP was reinstalled and reactivated).

Moving my dozens of mailing list subscriptions would have been a very severe nuisance in and of itself, but moving my several hundred product registrations (all done using hsuwh@hotmail.com) would have been completely infeasible.

As for application integration: you may know that I run two browsers (Mozilla Firefox, currently v1.5.0.6, and Mozilla Navigator, currently v1.7.12 and 1.7.13) in order to keep two "remember me" login spaces and cookiespaces without logging out. Add to that MSOE and Thunderbird at the same time, and you have a severe tax on GDI resources: I have 2Gb on Hirilonde and usually get over the halfway mark on commit charge, and my other systems all have 1Gb or less (512Mb on Tulkas, 768Mb on Telperion). This has also tended to put the last nail in my instability coffin, causing Firefox to crash 1-5 times a day on Hirilonde, taking WinXP Pro with it in a Blue Screen of Death (BSOD) every few (3-10) crashes.

So, to summarize: I needed to fetch mail from two GMail and two Live Mail (formerly Hotmail) accounts to each of six systems, without resorting to a big IMAP kludge, and I needed to do it all from Thunderbird so my PCs won't get (more) overloaded and unstable.

Enter Project Angainor.

Project Angainor: the Chain of GMail

The facts:

  • GMail has POP and forwarding.

  • One option in forwarding is to leave a copy of the message on the original account.

  • One option in POP service is to delete the copy of the message from the account from which it was downloaded.

  • GMail's spam filtering is excellent (high in both precision and recall), even now that spammers are trying.

  • Google generously provides users with several invitations, even on "younger generation" accounts.

  • I need to fetch mail from banazir@gmail.com, my spam-free work GMail account, hsuwh@hotmail.com, rizanabsith@hotmail.com, and possibly later hsuwh@yahoo.com and rizanabsith@yahoo.com.

  • I need to see all messages from the above accounts from one mail client (Thunderbird) running on each of six systems: Hirilonde, Laurelin, Osse (my office system), Numerramar, Telperion, and Tulkas.

  • I don't need to keep the fetched copies around forever as long as there is one folder that holds everything.

The scheme:

  • 1. I sent myself five GMail invitations.

  • 2. I created BanaLaurelin, BanaNumerramar, BanaOsse, BanaTelperion, and BanaTulkas.

  • 3. I went to Settings -> Forwarding and POP in each GMail account and set up forwarding as a chain: banazir and the spam-free account to BanaLaurelin, BanaLaurelin to BanaNumerramar, BanaNumerramar to BanaOsse, BanaOsse to BanaTelperion, and BanaTelperion to BanaTulkas.

  • 4. I marked banazir and the spam-free account "keep copy in Inbox" and set Hirilonde to fetch from them.

  • 5. I enabled POP for all messages and checked "when accessed from POP, delete from GMail" on the accounts for Laurelin, Numerramar, Osse, Telperion, and Tulkas (BanaLaurelin through BanaTulkas).

Voila, instant IMAP-like functionality. Each system fetches from its "tap" (Hirilonde from banazir and the spam-free account, Laurelin from BanaLaurelin, etc.). Each one sends out from the SMTP associated with its tap but has replies directed to banazir@gmail.com. The spam-free account remains inviolate.
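To make the topology concrete, here is a toy model of the chain, a sketch only: it touches no network and no real GMail API. The dictionary and function names are made up for illustration, and "spamfree" is a placeholder for the spam-free work account, which isn't named in this post.

```python
# Toy model of the forwarding chain; no network access, no real GMail API.
# "spamfree" is a placeholder name for the unnamed spam-free work account.

# Forwarding rules from step 3: each account forwards to the next link.
FORWARDS = {
    "banazir": "BanaLaurelin",
    "spamfree": "BanaLaurelin",
    "BanaLaurelin": "BanaNumerramar",
    "BanaNumerramar": "BanaOsse",
    "BanaOsse": "BanaTelperion",
    "BanaTelperion": "BanaTulkas",
}

# Which account(s) each PC fetches from via POP (its "tap").
TAPS = {
    "Hirilonde": ["banazir", "spamfree"],
    "Laurelin": ["BanaLaurelin"],
    "Numerramar": ["BanaNumerramar"],
    "Osse": ["BanaOsse"],
    "Telperion": ["BanaTelperion"],
    "Tulkas": ["BanaTulkas"],
}


def accounts_reached(root):
    """Return every account that receives a message arriving at root."""
    reached = [root]
    while reached[-1] in FORWARDS:
        reached.append(FORWARDS[reached[-1]])
    return reached


def machines_reached(root):
    """Return the PCs whose tap receives a message arriving at root."""
    holders = set(accounts_reached(root))
    return sorted(pc for pc, taps in TAPS.items() if holders & set(taps))
```

Calling `machines_reached("banazir")` or `machines_reached("spamfree")` should list all six PCs, which is the whole point of the chain: a message entering at either root ends up at every tap.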

There's one thing left to do: I still need to forward Live Mail and Yahoo Mail to GMail receiverships (or folders on the spam-free account).

It's not such an elegant hack, IMO, because it's tremendously wasteful of bandwidth, but it works, and it does something I've tried for almost two years to do, without success.

What do you think?



( 9 comments — Leave a comment )
Sep. 7th, 2006 09:32 pm (UTC)
I think I am a bit confused, but if it works then that's awesome. :D
Sep. 7th, 2006 10:41 pm (UTC)
Oh, it works all right
See the diagram I just put in. There's no delay that I can detect, between the time something arrives at banazir (one root of the tree) and when it arrives at BanaTulkas (the leaf, or the end of the lower chain).

Sep. 7th, 2006 09:33 pm (UTC)
I fail to see how it's wasteful of bandwidth if it's accomplishing what needs to be accomplished. If the only way to accomplish it in a reasonable amount of time is to do it the way you've done it, then that's the nature of getting as much mail as you get. One day when the pipes for internet2 are laid down, it won't even matter, in which case the elegant solution is the one that takes advantage of all of the options you've used to solve your problem. It's like an example of distributed computing, which by its nature is elegant.
Sep. 8th, 2006 01:13 am (UTC)
N-fold local replication
Well, since I'm downloading entire message bodies, this means that for N clients (here, N = 6), I'm transferring every message 2N - 1 times (N - 1 forwards and N downloads) and storing it N + 1 times (on the local file system of N computers, plus Google). As you say, though, there isn't much of a choice with POP.
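The same arithmetic as a few lines of Python, just a sketch; `replication_cost` is a name I made up, and it assumes one message entering at a single root account with one copy kept on Google.

```python
def replication_cost(n_clients):
    """Count transfers and stored copies of one message for n_clients PCs."""
    forwards = n_clients - 1        # one hop per link in the chain
    downloads = n_clients           # each PC pulls the message via POP
    transfers = forwards + downloads
    stored_copies = n_clients + 1   # N local copies plus one on Google
    return transfers, stored_copies


transfers, copies = replication_cost(6)
print(transfers, copies)  # 11 7
```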

Sep. 7th, 2006 10:26 pm (UTC)
Since I am a bit scattered right now, I had to read through it twice to get how everything worked. However..

.. I think it's brilliant, and excess use of bandwidth be damned. I don't get nearly as large a volume of e-mail as you do, but I can understand the complications that would come from opening up the mail client and being besieged by everything. Your solution to the problem is, in two words, pure awesome.

(and the BSOD is evil, isn't it? I don't get it much anymore now that I've abandoned online gaming for more study time, but it always comes when one does not want it to, which is the nature of errors, but I digress. As I mentioned before, I'm scattered, and incoherence comes with that certain territory :D)
Sep. 8th, 2006 09:31 am (UTC)
Thanks. I just needed a consolidated receivership, as my mail was getting fragmented across half a dozen services.

As for the BSOD, it's precisely because I'm getting it even with professional apps alone that I feel I need to stabilize things.

You've got me curious: whence the scattering? :)

(Deleted comment)
Sep. 8th, 2006 02:00 am (UTC)
Marking POP-downloaded messages still retrievable on the server
I tried to do this in Thunderbird and I couldn't find a way - if I got six messages and downloaded each in turn on a different system above, there would be just that one message on each PC, even though the GMail folder still had 6 messages in it.

Are you saying it's possible to keep the connection open, or will it be unfetchable after some timeout period? Did I miss a setting in Thunderbird or GMail?

(Deleted comment)
Sep. 8th, 2006 02:53 am (UTC)
GMail POP behavior not standards-compliant with POP3
Now I understand. And hence my brute-force workaround. How did Gilmore's saying go?

"The Net interprets censorship as damage and routes around it."

Users interpret aberrant functionality as damage, too. ;-)

Sep. 8th, 2006 05:03 pm (UTC)
What you did is very clever.

But I think that you are better off running your own IMAP server with a provider. Though it costs some time and money, you can have a lot of control over the environment and its settings.

This way you can have an online backup of the mail you receive too, which can be accessed from any machine via a client of your choice.