Discussions API; comments_count field is misleading

Matthew Haugen's Avatar

Matthew Haugen

21 Mar, 2016 06:51 PM

Hey there,

I opened a duplicate discussion last week, but unfortunately never heard back. I wasn't sure whether this was an artifact of not having signed in (I see 0 people are watching it, so perhaps it got disregarded), or if you're just busy. Sorry for the duplicate, no rush, I just wanted to make sure this was somewhere in the queue :).

I was just reviewing some code I've got for doing an incremental backup of discussions we've got, and I noticed a hugely inefficient check I was doing. My code looked something like this:

For each discussion in each page of discussions
    Fetch pages (?page=x) of comments until "comments" attribute is null
        Report comments

The inefficiency here comes because I have to pull page=2 even if page=1 had all of the comments. So, for every discussion, I had to make two calls!

My solution to this was simple: add a condition for the second line to only pull the next page if my count of reported comments was less than the comments_count field.

However, I pretty quickly identified that this field excludes internal=true and system_message=true comments.

For a page size 4 (deliberately incorrect), my code would break in the following discussion:

  1. Customer: hello
  2. Agent: hey
  3. Agent (internal): help
  4. Agent (internal): no
  5. Customer: what's up?

A thrilling dialogue, but my algorithm would ignore the last message. comments_count would be 3 (reflecting 1, 2, and 5), so I'd only read one page.

My solution to this was to only count non-internal, non-system messages for my condition. That fixes this case, because 2 < 3 for the first page, so I read a second.

However, being the edge-case fanatic I am, this doesn't address a case: switch comments 4 and 5, so the last page is filled with only internal (or system) messages, and they'll always get ignored.

It's exceedingly rare that the actual page size (what is it, 30?) would be filled with comments, then just the 31st comment would be internal, but it could happen, and I'd hate to miss that data-point.

My next attempt was to use the last_comment_id field, but as far as I can tell, that doesn't correspond to anything in the comment data structure--number is very different, and it doesn't appear to be a concatenation including the discussion number, unless I'm missing something.

I could say that, for a given page size x, if we find x or more comments on a page, we should read the next page. But I'm worried because I can't find strict documentation of how many will be included, and it doesn't seem like this is something that should be hard-coded in anyway.

I feel like I haven't explained this well, so feel free to ask for clarification, but do you have any advice on a compromise?

Ideally, I'd like to avoid multiple unnecessary calls (for your sake and mine), but I also don't like having this, albeit subtle, hole.

  1. Support Staff 1 Posted by Courtenay on 23 Mar, 2016 04:19 AM

    Courtenay's Avatar

    Hi, sorry for the delay.

    I think the simplest thing is a variant on what you suggested - for page size x, where it returns x comments, retrieve another page. This will only result in a second call in a very small number of cases. The page size is 50 for discussion comments.

    While I can appreciate the desire for perfection here, unfortunately this comments_count is entrenched pretty firmly in the API involving multiple caching layers, db fields, etc, and would involve a whole dev cycle to modify; and we have much more exciting stuff to focus on for what can be solved with a client-side numeric check. :)

  2. brandi closed this discussion on 26 Apr, 2016 12:57 AM.

Discussions are closed to public comments.
If you need help with Tender please start a new discussion.

Keyboard shortcuts

Generic

? Show this help
ESC Blurs the current field

Comment Form

r Focus the comment reply box
^ + ↩ Submit the comment

You can use Command ⌘ instead of Control ^ on Mac