So one strategy could be to create a new, lighter repo containing only the files of interest from the SDK, with our commits added on top. This would decrease download times not only for users but also (more importantly) for CI.
It is relatively easy to create and keep synchronized using git subtree split: https://github.com/bgallois/light-polkadot-sdk. We might consider switching to a lighter repository at the next Polkadot upgrade.
One thing we also need to check: running cargo update outputs Updating git repository 'https://github.com/paritytech/polkadot-sdk'. Even if all your dependencies point to your polkadot-sdk fork, it seems that the upstream repository is still being downloaded.
I had not noticed that we were also pulling from paritytech. That's strange; we should add a check on the dependencies.
In addition to decreasing the size of the files themselves for this repo, we should also avoid cloning the whole history for all repos. It seems that cargo will be able to do shallow clones (https://github.com/rust-lang/cargo/issues/1171).
Yes, it is quite unsettling because paritytech/polkadot-sdk can be seen in the Cargo.lock file but not in the cargo tree output. This discrepancy comes from the part of polkadot-sdk that is not Substrate: using the light-polkadot-sdk mentioned earlier removes it.