Discussion:
Fog with S3, public URL and network requests
David
2014-10-24 07:27:31 UTC
Hello!

I have a small issue with this fantastic gem (great work!).

Here is a small script to get the public URL of an object stored in S3:

require "fog"


# Fires one request
storage = Fog::Storage.new(
:provider => "AWS",
:aws_access_key_id => ENV["AWS_KEY"],
:aws_secret_access_key => ENV["AWS_SECRET"],
:region => "eu-west-1"
)


# Fires one request
d = storage.directories.get(ENV["AWS_BUCKET"])


# Fires one request
d.files.get("A.txt").public_url

As you can see, this script will fire 3 requests to S3.

Now, here is the same script using the AWS SDK:

require "aws"


# No request fired
s3 = AWS::S3.new(
:access_key_id => ENV['AWS_KEY'],
:secret_access_key => ENV['AWS_SECRET']
)


# No request fired
b = s3.buckets[ENV["AWS_BUCKET"]]


# No request fired
b.objects["A.txt"].public_url.to_s

Not a single request is fired. I guess the idea behind this is: don't hit
S3 until you really, really need to.

My main issue is the request fired to get the public_url of an object.
Let me explain with an example: let's pretend we are building a Rails API
backend for movies. Each movie is linked to a poster image stored in S3
(as a public, read-only object).
For the index action, I want the backend to return simply the name of the
movie and the URL of its poster image.
The issue is that the backend will fetch the Movie records and then, for
each one, get the public URL through the corresponding Fog object. This
fires one request to S3 per movie.
This works well for a small number of Movie records, but not for a
reasonably large number of them (let's say 100 movies => 100 requests to
S3 just to get the URLs).
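
To make the shape of the problem concrete, here is a rough sketch of such
an index action ("storage" stands for the Fog connection from the first
script, and Movie#poster_key is a hypothetical column holding the S3
object key):

# Sketch only: one S3 round trip per movie, just to build URLs.
def index
  bucket = storage.directories.get(ENV["AWS_BUCKET"]) # 1 request
  movies = Movie.all.map do |movie|
    { :name       => movie.name,
      # files.get fires one more request per movie:
      :poster_url => bucket.files.get(movie.poster_key).public_url }
  end
  render :json => movies
end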

The question is therefore: can we avoid this request when calling
public_url on a Fog::Storage::AWS::File object? Is that possible with Fog?

I know I could build the public URL myself without using Fog: get the URL
of the bucket with public_url on the Fog::Storage::AWS::Directory object,
then build the object's public URL by string concatenation/interpolation.
The downside is that this kind of code is coupled to how S3 objects are
organised. I'd like to keep the code "provider agnostic" as much as
possible: if we switch from S3 to another provider, it should only be a
matter of storage configuration. That's why we are using Fog instead of
the AWS SDK.
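
Sketched out (reusing the hypothetical poster_key from above), that
workaround would be something like:

bucket_url = d.public_url                        # bucket URL from the Directory object
poster_url = "#{bucket_url}/#{movie.poster_key}" # coupled to S3's URL layout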

Thanks in advance for any answer.

Regards,

David
Frederick Cheung
2014-10-28 14:00:39 UTC
d = storage.directories.new(:key => bucket)
d.files.get_https_url("example.txt", 300)

shouldn't fire any requests, although that generates a signed URL.

The other approaches do seem to validate things such as whether the file exists.

Fred
geemus (Wesley Beary)
2014-10-28 14:06:12 UTC
Yeah, I think Fred hit the nail on the head.

directories/files#get is explicitly a call to fetch info, whereas #new is
simply a call to create a local reference.

Similarly, the #public_url method on file was (for better or worse) made
to be cautious and accurate, so it checks that the file exists before
giving back a possibly bogus URL. If you drop down to the helper that
generates the actual URL, it will avoid those calls, which should
hopefully give you what you need.

Sorry for any confusion or lack of clarity there. If you have suggestions
about how we could communicate that better, we would love to hear them
(and any documentation you might contribute would be awesome).

Thanks!
wes

David
2014-11-03 11:24:01 UTC
Hi there,

Thanks for your answers!

I think everything was clear: I understood that Fog is more cautious than
the AWS SDK. My question was simply whether there was a way to get the
public URL without all the checks.

As advised, I used the underlying helper to get the URL: #request_url on
Fog::Storage::AWS::Real.
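
Roughly like this (a sketch from memory; the exact parameter keys may
differ):

# Builds the URL string locally, no request fired.
url = storage.request_url(
  :bucket_name => ENV["AWS_BUCKET"],
  :object_name => "A.txt"
)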

It's nice to know the difference between #get and #new. The only downside
I see with #get_https_url is that there is no way (from what I can see) to
get the non-signed public URL.
In my scenario, I'm returning URLs of public resources that can be cached
by some clients (other backends or JS apps). In this case it makes more
sense to use simple, non-signed public URLs.

Anyway, thanks again for your answers,

Regards
David
geemus (Wesley Beary)
2014-11-03 15:02:17 UTC
I think the non-signed URLs can be constructed as
"https://#{bucket_name}.s3.amazonaws.com/#{object_name}" or something akin
to that, should you desire the non-signed versions. Perhaps we should have
more explicit options for constructing those for users, though.
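
As a sketch (ignoring region-specific endpoints and other edge cases):

def public_object_url(bucket_name, object_name)
  "https://#{bucket_name}.s3.amazonaws.com/#{object_name}"
end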
David
2014-11-04 08:00:06 UTC
Yep, I know I could do something like that, but I'd like to keep the code
"provider agnostic". That way, if we change from AWS S3 to, let's say,
Rackspace Files, it's just a matter of configuring the storage object
properly.

Thinking out loud: #get_http_url and #get_https_url could maybe accept a
:public => true option in order to build non-signed URLs.
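
Something like this (hypothetical, just to illustrate the idea):

d = storage.directories.new(:key => ENV["AWS_BUCKET"])
d.files.get_https_url("A.txt", 300, :public => true) # hypothetical option: skip signing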
geemus (Wesley Beary)
2014-11-04 14:32:49 UTC
Yeah, or perhaps even something like #get_public_http_url. I definitely
agree that being more agnostic is a good plan. If you'd like to file an
issue around this, we can discuss it some more and try to make a plan of
attack (otherwise these email threads can get lost in the noise
sometimes). Thanks!
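
For concreteness, such a hypothetical accessor might look like:

d = storage.directories.new(:key => ENV["AWS_BUCKET"])
d.files.get_public_http_url("A.txt") # proposal only: plain public URL, no signing, no request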
David
2014-11-12 10:17:03 UTC
Yup, I created an issue on GitHub: https://github.com/fog/fog/issues/3263