UDP Packet Frequency - Playing Well with the Server

Moderator: AniDB

Locked
Ommina
Posts: 3
Joined: Mon Sep 17, 2007 9:15 pm

Post by Ommina » Mon Sep 17, 2007 10:19 pm

Hello all from the towering post-count of one.

I'm working on a UDP client, and expect to start with the UDPness (is too a word) by the end of the week.

It is, though, important to me to play nice, so I would like to get some clarification of the 'long term' flood protection rules.

As is typical, a first task of the client will be to create a cache of details concerning the user's collection. This has the potential to create a sizable batch of packets: call it one packet per file, plus one for each unique group and anime. I don't currently expect posting to MyList to be part of the initial scan - this behaviour is already covered adequately by other clients.

Considering a modest collection of 2000 files (wild guess, I pulled the number out of the air), at one packet per 30 seconds, it's about 17 hours to work through the batch.
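The estimate above is plain arithmetic. A minimal sketch, using the 2000-file guess and the 30-second interval from the post; the function name is purely illustrative:

```python
def scan_hours(packets: int, seconds_per_packet: float = 30.0) -> float:
    """Hours needed to send `packets` packets at one packet per interval."""
    return packets * seconds_per_packet / 3600.0


# 2000 packets * 30 s = 60000 s
print(round(scan_hours(2000), 1))  # prints 16.7
```

This counts only the per-file packets; the extra packets for unique groups and anime would push the total somewhat higher.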

After the cache is built, things settle down as I only need to worry about new files. But even so, I'm somewhat concerned about that 17 hour number.

Not that I mind making the user wait that long - it's a one time event for them. But I don't want the server going "grrr!" at me for using resources over that length of time: understanding that it is a connectionless protocol, the server must still maintain state information for the full duration of the login.

So -- is maintaining a login for this long OK with the local powers that be? Do you have other recommendations / preferences for minimizing server costs? I'm willing to abandon the initial cache entirely if such is required, but it isn't ideal for my needs.

Many thanks!

Rar
AniDB Staff
Posts: 1471
Joined: Fri Mar 12, 2004 2:41 pm
Location: UK

Post by Rar » Mon Sep 17, 2007 11:36 pm

You probably want to read: Particularly the bits about the practicality of mylist mirroring through UDP syncage, and XML exports.

Rar

Ommina
Posts: 3
Joined: Mon Sep 17, 2007 9:15 pm

Post by Ommina » Tue Sep 18, 2007 12:55 am

Good reads! Thank you for your reply. (And lets me double my post count!)

My interest in MyList is actually secondary, indeed, tertiary, however. Presently, the only MyList functionality that I intend to offer is a "Mark Watched / Unwatched" toggle. While marking an item watched implies adding it to the list first, that's about as much as I want to worry about it. Changes made to MyList through any other means are not a concern to my client.

What I would like is basic anime / episode information for files found on the HDD directories supplied. (I do not expect to support files backed up onto optical media.)

While not a file renaming utility, it needs roughly the same amount of data from the server (a packet per file hashed, plus one per unique anime & group).
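As a rough illustration of that packet count (one per file, plus one per unique anime and group), here is a small sketch. The `FileInfo` record and its field names are hypothetical and are not taken from the AniDB API:

```python
from typing import Iterable, NamedTuple


class FileInfo(NamedTuple):
    # Hypothetical record: the IDs a per-file lookup might return.
    anime_id: int
    group_id: int


def packets_needed(files: Iterable[FileInfo]) -> int:
    """One packet per file, plus one per unique anime and per unique group."""
    files = list(files)
    animes = {f.anime_id for f in files}
    groups = {f.group_id for f in files}
    return len(files) + len(animes) + len(groups)


# Three files covering two anime, all from the same group.
example = [FileInfo(1, 10), FileInfo(1, 10), FileInfo(2, 10)]
print(packets_needed(example))  # prints 6
```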

This saves me from having to keep MyList and uhm, my list, synced. But still leaves the initial creation of the cache as potentially expensive on the server.

Thanks again!

epoximator
AniDB Staff
Posts: 379
Joined: Sun Nov 07, 2004 11:05 am

Post by epoximator » Tue Sep 18, 2007 6:39 am

staying logged in forever is ok

exp
Site Admin
Posts: 2438
Joined: Tue Oct 01, 2002 9:42 pm
Location: Nowhere

Post by exp » Tue Sep 18, 2007 7:35 am

What exactly do you mean by "caching" mylist data?

Do you mean hashing all local files and transmitting all their hashes to the db and obtaining data for each file in the process? (So you would end up with a cached list of anidb data for all files which you hashed, but not for everything in the user's mylist.) That would be an acceptable use.

However, if you're expecting to simply obtain a copy of the user's full mylist, you MUST NOT try to obtain it through UDP. Such use of the UDP API is explicitly forbidden, no matter how long you choose your delays.
The solution here is to use one of the XML-based mylist exports as the initial data import. That way you get the user's entire mylist data for your cache and can then start to work via UDP.
A planned addition to this is a just-in-time HTTP API for mylist XML data, which would allow a client to simply download and parse a user's complete mylist for the initial cache buildup. This would be fast and efficient, with no need to wait for a mylist import to finish or for 17h of one-packet-every-30-secs communications.

BYe!
EXP

Ommina
Posts: 3
Joined: Mon Sep 17, 2007 9:15 pm

Post by Ommina » Tue Sep 18, 2007 9:50 am

epoximator: Thanks! That makes me feel significantly better. The information provided through the API is invaluable, and it's important to me to not give the server fits.

exp: Definitely the first one. The only data in which I am interested is that pertaining to the files physically on the user's HDD. I'm explicitly excluding any file that we would describe as 'external storage' here. The only data in which I have any interest is that to which the client has immediate access.

That said - I wonder if I can make good use of the HTTP API regardless. If the user has already updated their MyList through some other client, then the data will be there and I can discard entries on secondary storage. It also gets them up and going within minutes instead of days.

If their MyList is incomplete, well, then I'm no further behind than I am now.

So, I'll go with the UDP requests for the moment -- they need to happen anyway -- but will watch for the HTTP API release as well, in hopes of putting it to good use.

Once again, thank you all for your help.

titoum
Posts: 29
Joined: Mon Sep 13, 2004 10:48 am
Location: Plop

Post by titoum » Thu Sep 20, 2007 4:04 pm

for me (project still running :p)

i have a nice anidb list and i can add ep through my client, just need to make the query to get back the information about the file i just added and caching it.

now for massive hashing, you should recommend using aom, but for adding 20-30 files go with your client => 30 * 5sec delay between each add + file info should be good, no?
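A fixed delay between requests could be sketched roughly as below. The 5-second default is the figure suggested in this post, not a confirmed server rule, and `send` is a hypothetical callable standing in for whatever actually transmits a packet:

```python
import time
from typing import Callable, Iterable, List


def send_throttled(
    requests: Iterable[str],
    send: Callable[[str], str],
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Send each request in order, pausing between consecutive sends."""
    replies = []
    for index, request in enumerate(requests):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        replies.append(send(request))
    return replies
```

Passing `sleep` as a parameter keeps the function testable without real waiting; a client aiming at the long-term rate discussed earlier in the thread would pass a larger `delay_seconds`.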

MostAwesomeDude
Posts: 38
Joined: Fri Jun 01, 2007 11:02 am

Post by MostAwesomeDude » Fri Sep 21, 2007 6:25 pm

Why does this sound familiar?

Anyway, everybody else pretty much got it. While there's no penalty for a long login, the UDP API isn't for downloading massive amounts of data. At some point (soon?) in the future, mylists will be retrievable through a nice XML format. Until then, just try to only keep resident the parts of the DB that have been requested. (Don't load an entire mylist if you only need to view one anime!)

MostAwesomeDude
Posts: 38
Joined: Fri Jun 01, 2007 11:02 am

Post by MostAwesomeDude » Fri Sep 21, 2007 6:28 pm

Oh, and as for syncing, you should probably be keeping data indefinitely, with a user-controlled option to "refresh" the data. That's the best way to keep load off the server, and since there is no set limit on expiry for data returned by the server, you will save everybody a headache by just assuming that your last dataset is still good. The only other alternative is to not cache at all, and nobody wins that way.
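A minimal sketch of that "keep indefinitely, refresh only on request" idea. The class and the `fetch` callable are illustrative assumptions; `fetch` stands in for a real server lookup keyed by file hash:

```python
from typing import Callable, Dict


class FileCache:
    """Keeps fetched entries forever; only an explicit refresh re-queries."""

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch  # hypothetical server lookup
        self._entries: Dict[str, dict] = {}

    def get(self, file_hash: str) -> dict:
        # Query the server only the first time a hash is seen.
        if file_hash not in self._entries:
            self._entries[file_hash] = self._fetch(file_hash)
        return self._entries[file_hash]

    def refresh(self, file_hash: str) -> dict:
        # User-triggered: always re-query and overwrite the stored entry.
        self._entries[file_hash] = self._fetch(file_hash)
        return self._entries[file_hash]


calls = []


def fake_fetch(file_hash: str) -> dict:
    calls.append(file_hash)
    return {"hash": file_hash}


cache = FileCache(fake_fetch)
cache.get("abc")      # first lookup hits the (fake) server
cache.get("abc")      # served from the cache
cache.refresh("abc")  # explicit refresh hits the server again
print(len(calls))  # prints 2
```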

~ C.
