Create a new endpoint to return a matched CDXJ record #473

Open
ibnesayeed opened this issue Aug 7, 2018 · 8 comments

@ibnesayeed
Member

We need a new endpoint that returns an index record instead of a reconstructed memento. This will let us fetch IPFS blocks directly from the Service Worker (SW) and reconstruct the memento there instead of having the server do it, which is a step toward server-free decentralized replay. It will also eliminate the need for threading, as we can leverage the asynchronous nature of JS for concurrent fetches. Additionally, we can avoid the Location header rewriting issue (as per #456 and #461) by reusing the logic already present in Reconstructive.
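To illustrate the concurrency point only: the SW itself would be written in JavaScript, but the same pattern can be sketched in Python with asyncio. This sketch assumes a locator of the form urn:ipfs/<header digest>/<payload digest>; `split_locator`, `fetch_block`, and `reconstruct` are hypothetical names, and `fetch_block` is a stand-in that does not talk to IPFS.

```python
import asyncio


def split_locator(locator):
    """Split 'urn:ipfs/<header>/<payload>' into its two digests (assumed layout)."""
    prefix = "urn:ipfs/"
    if not locator.startswith(prefix):
        raise ValueError("unsupported locator: " + locator)
    header_digest, payload_digest = locator[len(prefix):].split("/", 1)
    return header_digest, payload_digest


async def fetch_block(digest):
    """Stand-in for a real IPFS fetch; returns fake bytes after a short delay."""
    await asyncio.sleep(0.01)
    return ("block:" + digest).encode()


async def reconstruct(locator, fetch=fetch_block):
    """Fetch the header and payload blocks concurrently, without threads."""
    header_digest, payload_digest = split_locator(locator)
    # Both fetches are in flight at the same time on a single event loop.
    header, payload = await asyncio.gather(
        fetch(header_digest), fetch(payload_digest)
    )
    return header, payload
```

Calling `asyncio.run(reconstruct("urn:ipfs/AAA/BBB"))` would return the two fake blocks as a tuple; a real client would pass its own `fetch` coroutine.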

/cdxj/:datetime/:urir should return 404 if no record is found; otherwise it should return 200 (not a 3xx) with the single matching entry extracted from the index. We can return either the application/cdxj+ors content type or, if we transform the index record into JSON, application/json.
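A minimal sketch of that behavior, kept framework-free so it is easy to try out. Everything here is an assumption for illustration: the in-memory `INDEX` with placeholder locators, the simplified `to_surt` helper (no full canonicalization), the name `cdxj_lookup`, and the choice that the "matched" record is the one closest in time to the requested datetime.

```python
from datetime import datetime
from urllib.parse import urlsplit

# Hypothetical index lines in "SURT datetime JSON" form; locators are placeholders.
INDEX = [
    'us,memento)/ 20130202100000 {"locator": "urn:ipfs/HDR1/PAY1", "status_code": "200"}',
    'us,memento)/ 20140115101500 {"locator": "urn:ipfs/HDR2/PAY2", "status_code": "200"}',
]


def to_surt(urir):
    """Simplified SURT: reversed host labels, then the path."""
    if "://" not in urir:
        urir = "http://" + urir
    parts = urlsplit(urir)
    host = ",".join(reversed(parts.hostname.lower().split(".")))
    return host + ")" + (parts.path or "/")


def parse_dt14(value):
    """Parse a 14-digit YYYYMMDDhhmmss datetime."""
    return datetime.strptime(value, "%Y%m%d%H%M%S")


def cdxj_lookup(dt14, urir, index=INDEX):
    """Return (status, content_type, body) for /cdxj/:datetime/:urir."""
    surt = to_surt(urir)
    candidates = [line for line in index if line.split(" ", 2)[0] == surt]
    if not candidates:
        return 404, "text/plain", "No record found\n"
    # Assumption: pick the record closest in time to the requested datetime.
    target = parse_dt14(dt14)
    best = min(
        candidates,
        key=lambda line: abs(parse_dt14(line.split(" ", 2)[1]) - target),
    )
    # 200 with the record itself, never a redirect.
    return 200, "application/cdxj+ors", best + "\n"
```

With the sample index, `cdxj_lookup("20140101000000", "memento.us/")` yields status 200 with the 20140115101500 record, while an unknown URI-R yields 404.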

@machawk1
Member

machawk1 commented Aug 7, 2018

@ibnesayeed It seems that the goal you described for server-less replay would make achieving #434 even more difficult. Please comment on this, as I am motivated to integrate Prefer support, which is relevant to my (our) research.

@ibnesayeed
Member Author

No, the two are independent things. We can continue implementing support for the Prefer header for raw mementos. However, with client-side IPFS fetch, we will need neither the raw nor the rewritten mementos, as we will be performing the composition directly on the client side. Those accessing the server without the SW in place would need to talk to the regular memento endpoint.

machawk1 self-assigned this Aug 7, 2018
@machawk1
Member

machawk1 commented Aug 7, 2018

@ibnesayeed Do you think the CDXJ meta headers should be included in the response?

I am working on an implementation that leverages our existing functionality and want to be sure I route through the right functions so that I do not have to duplicate any of it.

machawk1 added a commit that referenced this issue Aug 7, 2018
@ibnesayeed
Member Author

> Do you think the CDXJ meta headers should be included in the response?

We can think about that later when we start consuming the response. We might just create a JSON object that has all the necessary bits from the matched record and any other necessary metadata in it.

@machawk1
Member

machawk1 commented Aug 7, 2018

> ...just create a JSON object...

Per our verbal discussion, please outline how you expect this JSON object to look, e.g., including all the Memento-esque relations. Just an example ought to get us moving in the right direction to make this endpoint more usable for the replay banner.

@machawk1
Member

machawk1 commented Aug 8, 2018

@ibnesayeed Please document here the alternative Prefer semantics you described to me verbally in lieu of having a CDXJ endpoint.

@ibnesayeed
Member Author

I think we are looking for something like this:

$ curl -i -H "Prefer: return=minimal" "http://localhost:5000/memento/20140115101500/memento.us/"
HTTP/1.0 200 
Preference-Applied: return=minimal
Content-Type: application/json
Memento-Datetime: Wed, 15 Jan 2014 10:15:00 GMT
Link: <http://memento.us/>; rel="original",
 <http://localhost:5000/timemap/link/memento.us/>; rel="timemap"; type="application/link-format",
 <http://localhost:5000/timemap/cdxj/memento.us/>; rel="timemap"; type="application/cdxj+ors",
 <http://localhost:5000/timegate/memento.us/>; rel="timegate",
 <http://localhost:5000/memento/20130202100000/memento.us/>; rel="first memento"; datetime="Sat, 02 Feb 2013 10:00:00 GMT",
 <http://localhost:5000/memento/20140114100000/memento.us/>; rel="prev memento"; datetime="Tue, 14 Jan 2014 10:00:00 GMT",
 <http://localhost:5000/memento/20140115101500/memento.us/>; rel="memento"; datetime="Wed, 15 Jan 2014 10:15:00 GMT",
 <http://localhost:5000/memento/20161231110000/memento.us/>; rel="next memento"; datetime="Sat, 31 Dec 2016 11:00:00 GMT",
 <http://localhost:5000/memento/20161231110001/memento.us/>; rel="last memento"; datetime="Sat, 31 Dec 2016 11:00:01 GMT"
Server: InterPlanetary Wayback Replay/0.2018.08.08.0200
Date: Wed, 08 Aug 2018 21:39:39 GMT
Content-Length: 272

{
  "surt": "us,memento)/",
  "datetime": "20140115101500",
  "locator": "urn:ipfs/QmbyEELu2DNagj4bvdxCb4N7XHeSQEupbEugXTqnQ6QBGE/QmXDsUhfSzvtTwakyt6McXnjpzAw2BQvAcVdSCWSp2Tfge",
  "original_uri": "http://memento.us/",
  "mime_type": "text/html",
  "status_code": "200"
}
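A JSON body like the one above could be derived from a single index line roughly as follows. This is only a sketch: it assumes the index line has the form "SURT datetime JSON-object", and `cdxj_to_json` is a hypothetical helper name.

```python
import json


def cdxj_to_json(line):
    """Turn one 'SURT datetime {json}' index line into a flat JSON object."""
    surt, dt14, payload = line.strip().split(" ", 2)
    record = {"surt": surt, "datetime": dt14}
    # Fields stored in the line's JSON block (locator, original_uri, ...)
    # are merged in after the two positional fields.
    record.update(json.loads(payload))
    return json.dumps(record, indent=2)
```

Putting `surt` and `datetime` first keeps the positional fields of the index line ahead of the attributes carried in its JSON block.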

@ibnesayeed
Member Author

While Prefer: return=minimal might work here, the specs give no guarantee about what the server will return as a minimal representation. Hence, if we want tighter semantics about what the client expects, we can use a custom preference such as Prefer: memento-variant=index (as discussed in a potential RFC extension discussion).
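On the server side, choosing between the two preferences might look something like the sketch below. The function names and the "index"/"memento" labels are hypothetical, and the parser is deliberately simplified (it ignores preference parameters and does not handle commas inside quoted values).

```python
def parse_prefer(header_value):
    """Parse a Prefer header into a {name: value} dict (simplified)."""
    prefs = {}
    for item in header_value.split(","):
        # Drop any ";parameter" suffix and surrounding whitespace.
        token = item.split(";", 1)[0].strip()
        if not token:
            continue
        name, _, value = token.partition("=")
        # Keep only the first occurrence of each preference.
        prefs.setdefault(name.strip().lower(), value.strip().strip('"'))
    return prefs


def choose_variant(header_value):
    """Return (variant, Preference-Applied value or None)."""
    prefs = parse_prefer(header_value or "")
    if prefs.get("memento-variant") == "index":
        return "index", "memento-variant=index"
    if prefs.get("return") == "minimal":
        return "index", "return=minimal"
    return "memento", None
```

A request without a Prefer header falls through to the regular memento, so existing clients would be unaffected.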
