
Congratulations on greatly reproducing #77

Open · HeathHose opened this issue Nov 15, 2019 · 12 comments

@HeathHose commented Nov 15, 2019

Thanks for your great AutoML! Could you please release the architecture found by the search?

HeathHose changed the title from "Congratulations for greatly reproduce" to "Congratulations for greatly reproducing" on Nov 15, 2019

@HeathHose (Author) commented Nov 15, 2019

Is the search config used in obtain_search_args?
Also, I found that AutoML is slightly inconsistent with official_deeplab. official_deeplab does the following:

    if prev_layer is None:
      prev_layer = net

while AutoML does the following:

    if prev_prev_fmultiplier == -1 and j == 0:
        op = None

  • AutoML ignores s0, which some corner cells don't have, while official_deeplab copies s1 as s0.
  • Granted, the tensorflow code is only for testing the architecture, not for searching it, but I don't think the code above was written casually.

So I emailed the author, and the reply is as follows:

For those cells, I instead used the tensor with the most similar spatial size. Typically this means 4x larger tensor for l-2, and 2x larger tensor for l-1. More specifically, when 8H1 preprocesses stem0, it will use stride 4, and when it preprocesses stem1, it will use stride 2.

Yuan Fang, what do you think?
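The author's stride scheme can be sanity-checked with the standard convolution output-size formula. This is a sketch; `conv_out` is an illustrative helper, not a function from this repo:

```python
def conv_out(size, kernel=1, stride=1, padding=0):
    """Standard convolution output-size formula."""
    return (size + 2 * padding - kernel) // stride + 1

# Example input 512x512: stem0 runs at 1/2 resolution (256), stem1 at 1/4 (128),
# and a level-8 cell such as 8H1 runs at 1/8 resolution (64).
stem0, stem1, level8 = 256, 128, 64

# l-2 input (s0) comes from stem0, which is 4x larger -> 1x1 conv with stride 4
s0 = conv_out(stem0, kernel=1, stride=4)
# l-1 input (s1) comes from stem1, which is 2x larger -> 1x1 conv with stride 2
s1 = conv_out(stem1, kernel=1, stride=2)

assert s0 == s1 == level8  # both preprocessed tensors match the cell's resolution
```

So preprocessing stem0 with stride 4 and stem1 with stride 2 lands both tensors exactly on 8H1's own spatial size, which is what the author's reply describes.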

@zhizhangxian (Collaborator) commented:

Thanks, we will fix it just as the author describes!

@mrluin commented Nov 19, 2019

> (quotes @HeathHose's comment above in full)

@HeathHose Nice reply from the authors, but their answer confused me a lot.
Could you give me an example of how to handle the case where prev_prev_c (s0) is None?
What does "use the tensor with the most similar spatial size" mean? Or is it just a copy of prev_c (s1)?
Thanks in advance!

@HeathHose (Author) commented

@mrluin For example, 8H1 is the first node at level 8, so it doesn't have s0 and s1. The author therefore preprocesses stem0 with stride 4 and stem1 with stride 2 to use them as s0 and s1 respectively.

@NoamRosenberg (Owner) commented Nov 19, 2019

@mrluin Hi, I've bounced back and forth about how best to do this. The authors doing it this way doesn't automatically make it the best approach; I think there may be a better way. However, it won't be implemented soon, so if you think you can improve on our results, please try to do so. Should you get better results, I would love for you to make a pull request and become a contributor to the project.

@HankKung (Collaborator) commented

> (quotes @NoamRosenberg's comment above)

I'm getting on it; we'll see how the result turns out. I'll report the result here whether it gets better or not.

@HankKung (Collaborator) commented Nov 27, 2019

[screenshot: 2019-11-27 07:49:43]

architecture search results: [1 1 2 2 2 3 3 2 1 2 3 3]
new cell structure: [[ 0 4]
[ 1 4]
[ 4 5]
[ 2 4]
[ 7 5]
[ 6 5]
[13 5]
[ 9 4]
[17 5]
[18 4]]

This is the run where I added pre_pre_input for those edge tensors. Although the mIoU decreases slightly, the stem has only two layers (to stay consistent with the paper). In addition, all downsampling is implemented with stride 2 and stride 4.

(edited) As we can see, the search has not yet converged: the derived cell structures are all sep conv, without even one dilated conv, which shouldn't happen. But I think padding pre_pre_input for those edge tensors is necessary.

Searching for a larger number of epochs is a straightforward option. An alternative is to improve the robustness of the search; I recently found some interesting research on this: https://openreview.net/pdf?id=H1gDNyrKDS

@mrluin commented Dec 2, 2019

@HankKung Hi, thanks for your hard work!

> This is the run where I added pre_pre_input for those edge tensors.

Is the way you add pre_pre_input for those edge tensors like the following? And how do you perform downsampling with stride 4? Do you also use FactorizedReduce?

add pre_pre_input for the first nodes at each level (4, 8, 16, 32)
-----------------------------------------------
level-node | pre_pre_input   | pre_input
-----------------------------------------------
4-2        | output of stem0 | output of stem1
8-1        | output of stem0 | output of stem1
8-2        | output of stem1 | output of 8-1
16-1       | output of stem1 | output of 8-1
16-2       | output of 8-1   | output of 16-1
32-1       | output of 8-1   | output of 16-1
32-2       | output of 16-1  | output of 32-1

But after reading the official code of autodeeplab (the derived model), I found that the stem has three conv layers rather than two.

Also, how do you implement the derived model? The same as the official code (if pre_pre_input is None, treat it as a copy of pre_input)?

Looking forward to your reply!

@HankKung (Collaborator) commented Dec 2, 2019

> (quotes @mrluin's comment above in full)

About the stem: they used a two-layer stem during the search and a three-layer one during weight training (retraining), as you mentioned.
Yes, I used FactorizedReduce and DoubleFactorizedReduce for stride 2 and stride 4 respectively.
[screenshot: 2019-12-02 20:25:36]
Since I aim to stay consistent with the paper, the stem I implemented has only two layers; therefore, the pre_pre_inputs for the nodes are almost the same as the ones you described. The only difference is that I also add the output of stem0 as pre_pre_input for node 4-1.

I haven't evaluated the performance of the derived model, but the pre_pre_input should always exist, because it is simply the output from two nodes back (e.g., stem2's output as level 1's pre_pre_input); it doesn't need to have the same spatial resolution.

Glad to help! If you have any ideas or questions, I'm happy to discuss.
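The two reduction ops named here can be sketched in PyTorch. `FactorizedReduce` below follows the well-known DARTS implementation; `DoubleFactorizedReduce` is an assumption, sketched as two chained stride-2 reductions, which may differ from this fork's actual code:

```python
import torch
import torch.nn as nn

class FactorizedReduce(nn.Module):
    """DARTS-style stride-2 reduction: two offset 1x1 convs, concatenated."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.relu = nn.ReLU(inplace=False)
        self.conv_1 = nn.Conv2d(c_in, c_out // 2, 1, stride=2, bias=False)
        self.conv_2 = nn.Conv2d(c_in, c_out // 2, 1, stride=2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)

    def forward(self, x):
        x = self.relu(x)
        # the second conv sees the input shifted by one pixel, so no information
        # column/row is systematically dropped
        out = torch.cat([self.conv_1(x), self.conv_2(x[:, :, 1:, 1:])], dim=1)
        return self.bn(out)

class DoubleFactorizedReduce(nn.Module):
    """Stride-4 reduction, sketched here as two chained stride-2 reductions
    (an assumption about this fork's implementation)."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.reduce = nn.Sequential(FactorizedReduce(c_in, c_out),
                                    FactorizedReduce(c_out, c_out))

    def forward(self, x):
        return self.reduce(x)

x = torch.randn(2, 64, 64, 64)
print(FactorizedReduce(64, 128)(x).shape)        # torch.Size([2, 128, 32, 32])
print(DoubleFactorizedReduce(64, 128)(x).shape)  # torch.Size([2, 128, 16, 16])
```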

@mrluin commented Dec 2, 2019

Oh, that's very helpful!
I made a mistake in the list above. In the case of the two-layer stem:

4-1: output of stem0, output of stem1
4-2: output of stem1, output of 4-1

Now I think I understand your approach. Thank you very much!
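For reference, the corrected two-layer-stem input schedule from the discussion above can be written as a small lookup. This is illustrative plain Python, not repo code; keys are (level, node) and values are (pre_pre_input, pre_input):

```python
# (level, node) -> (pre_pre_input, pre_input), two-layer stem case,
# as discussed in this thread (including the corrected 4-1 / 4-2 entries)
INPUT_SCHEDULE = {
    (4, 1):  ("stem0", "stem1"),
    (4, 2):  ("stem1", "4-1"),
    (8, 1):  ("stem0", "stem1"),
    (8, 2):  ("stem1", "8-1"),
    (16, 1): ("stem1", "8-1"),
    (16, 2): ("8-1",  "16-1"),
    (32, 1): ("8-1",  "16-1"),
    (32, 2): ("16-1", "32-1"),
}

for key, (ppi, pi) in sorted(INPUT_SCHEDULE.items()):
    print(f"{key[0]}-{key[1]}: pre_pre_input={ppi}, pre_input={pi}")
```

A lookup like this makes the pattern visible: pre_pre_input is always the output from two nodes back along the chosen path, regardless of its spatial resolution.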

@NdaAzr commented May 5, 2020

@HankKung Thank you for the clarification about the network. I'm wondering what each number in the printed cell structure means, and how we can print(genotype) instead of decode. For example, here is the output of the new cell structure (genotype); what does each number mean? I'd greatly appreciate any help.

new cell structure: [[ 1  5]
 [ 0  4]
 [ 2  4]
 [ 3  0]
 [ 5  4]
 [ 7  4]
 [11  7]
 [12  4]
 [17  4]
 [18  2]]
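A hedged reading of these rows, assuming this repo follows the usual DARTS conventions (an assumption; confirm against the repo's genotypes/decoder module): each row is [edge_id, op_id], where op_id indexes the DARTS PRIMITIVES list and edge_id enumerates candidate input edges block by block (block i has i + 2 candidate inputs: s0, s1, then earlier blocks):

```python
# PRIMITIVES order is the standard DARTS list; whether this repo uses the same
# order is an assumption to verify in its source.
PRIMITIVES = ['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect',
              'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5']

def decode_rows(rows, num_blocks=5):
    """Map flat [edge_id, op_id] rows to (block, source, op_name) triples.
    source 0 = s0, 1 = s1, 2+ = earlier block (source - 2)."""
    names, offset = [], 0
    for i in range(num_blocks):
        n_inputs = i + 2  # s0, s1, and the i earlier blocks
        for edge_id, op_id in rows:
            if offset <= edge_id < offset + n_inputs:
                names.append((i, edge_id - offset, PRIMITIVES[op_id]))
        offset += n_inputs
    return names

rows = [[1, 5], [0, 4], [2, 4], [3, 0], [5, 4],
        [7, 4], [11, 7], [12, 4], [17, 4], [18, 2]]
for block, src, op in decode_rows(rows):
    print(f"block {block}: input {src} -> {op}")
```

Under this reading, the first row [1, 5] would mean block 0 takes input 1 (s1) through sep_conv_5x5. Again, this is a sketch of the convention, not confirmed repo behavior.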

@sdszqs commented Mar 19, 2021

> (quotes @NdaAzr's comment above in full)

Hey, I have the same problem as you. Have you found the answer?
