@AguTrachta
Contributor

@AguTrachta AguTrachta commented Jun 30, 2025

This implementation replaces the legacy dnsmasq-based system and is built to integrate cleanly with odhcpd and the shared-state-async daemon.

When odhcpd assigns or renews a lease, it executes the script defined in the leasetrigger option.

The trigger script (shared-state-publish_odhcpd_leases) is called. This script:

  • Fetches the full, up-to-date list of IPv4 leases directly from the odhcpd daemon via ubus.
  • Converts the lease data into a clean JSON map with the format { "IP_ADDRESS": { "mac": "...", "hostname": "..." } }.
  • Uses shared-state-async insert to rapidly hand off this data to the odhcpd-leases CRDT.
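The conversion step in the list above can be sketched as follows. This is only an illustration, not the script from this PR: it assumes the leases have already been flattened to one `IP MAC HOSTNAME` line each (a hypothetical intermediate format; the real script reads JSON from ubus), and the function name `leases_to_json_map` is made up for the example.

```sh
#!/bin/sh
# Hypothetical helper: turn "IP MAC HOSTNAME" lines on stdin into the
# { "IP": { "mac": "...", "hostname": "..." } } map described above.
# Assumes fields contain no quotes or backslashes (no JSON escaping is done).
leases_to_json_map() {
    awk '
        BEGIN { printf "{" }
        NF >= 2 {
            # sep is empty for the first entry and a comma afterwards.
            printf "%s\"%s\":{\"mac\":\"%s\",\"hostname\":\"%s\"}", sep, $1, $2, $3
            sep = ","
        }
        END { print "}" }
    '
}

# Example with two made-up leases; the second one has no hostname.
printf '%s\n' \
    '10.13.0.10 aa:bb:cc:dd:ee:01 laptop' \
    '10.13.0.11 aa:bb:cc:dd:ee:02' | leases_to_json_map
```

A lease without a hostname ends up with an empty `hostname` string here; whether the real publisher omits the key instead is not stated in this thread.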

The shared-state hook (shared-state-generate_odhcpd_leases) is executed on the remote node. This script:

  • Reads the full lease data for the network, which is provided as a JSON array.
  • Filters out any leases that were published by the node itself, preventing redundant self-configuration.
  • Writes the list of remote leases to /tmp/ethers.mesh in the standard MAC IP format.
  • Creates a symlink from /etc/ethers to /tmp/ethers.mesh
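The filtering and writing steps in the list above could look roughly like this. It is a sketch under assumptions: the input is taken to be already flattened to `AUTHOR IP MAC` lines (a hypothetical format; the real hook receives JSON), and `remote_leases_to_ethers` is an invented name.

```sh
#!/bin/sh
# Hypothetical helper: read "AUTHOR IP MAC" lines on stdin and print
# "MAC IP" lines (ethers format) for every lease not published by node $1.
remote_leases_to_ethers() {
    awk -v own="$1" 'NF >= 3 && $1 != own { print $3, $2 }'
}

# Example: node-a drops its own lease and keeps the one from node-b.
printf '%s\n' \
    'node-a 10.13.0.10 aa:bb:cc:dd:ee:01' \
    'node-b 10.13.0.20 aa:bb:cc:dd:ee:02' | remote_leases_to_ethers node-a
```

Run as node-a, the example prints only `aa:bb:cc:dd:ee:02 10.13.0.20`. On a node the output would be redirected to /tmp/ethers.mesh.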

This is an initial implementation; there are areas for improvement, such as IPv6 support. Feedback, suggestions for improvement, and collaborative testing are highly encouraged and appreciated as we work towards finalizing this feature.

@javierbrk
Collaborator

Very good! In fact, amazing!

Just a few comments.

Shared-state has been through a review and reimplementation. This is still a work in progress.
Documentation has been added into the Readme of the shared state async package.

The implementation you made seems to be based on the old format. Please update it to the new one.
Take shared-state-ref_state_commons as an example.

I think that the Readme of shared-state-async could use some improvements, so feel free to propose them.

Also, the info from this pull request would make a good readme for the package.
That readme could usefully include instructions on how to test this manually. And, if possible, think about or implement some unit tests using Docker, or tests to be performed using virtual machines.

Collaborator

@javierbrk javierbrk left a comment

@G10h4ck can you take a look? To me it seems to be OK.

@AguTrachta AguTrachta marked this pull request as ready for review July 9, 2025 00:06
@AguTrachta AguTrachta requested a review from javierbrk July 9, 2025 19:27
@ilario
Member

ilario commented Jul 23, 2025

Hi!
I did some coarse testing; here are some comments.

I installed odhcpd and then this package on a router that was already flashed, so it seems that the /etc/ethers file was already there and the link was not created by this line:
[ -e /etc/ethers ] || ln -s "$LEASEFILE" /etc/ethers

The TRIGGERFILE variable seems to be missing a leading /, so in my /etc/config/dhcp I see this leasetrigger option, which looks broken:

config odhcpd 'odhcpd'
	option maindhcp '1'
	option leasefile '/tmp/hosts/odhcpd'
	option leasetrigger 'usr/share/shared-state/publishers/shared-state-publish_odhcpd_leases'
	option loglevel '4'
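
The broken value above is a relative path. A guard along these lines could catch that before the option is written. This is a sketch only: the function name `is_valid_trigger` is hypothetical, and nothing in this thread says the package performs such a check.

```sh
#!/bin/sh
# Hypothetical guard: succeed only if $1 is an absolute path to an
# executable regular file, which is what a leasetrigger value should be.
is_valid_trigger() {
    case "$1" in
        /*) [ -f "$1" ] && [ -x "$1" ] ;;
        *)  return 1 ;;
    esac
}

# Example: the value from the config above lacks the leading slash.
TRIGGERFILE='usr/share/shared-state/publishers/shared-state-publish_odhcpd_leases'
if is_valid_trigger "$TRIGGERFILE"; then
    echo "leasetrigger OK: $TRIGGERFILE"
else
    echo "leasetrigger is not an absolute executable path: $TRIGGERFILE"
fi
```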

I disabled and stopped dnsmasq, but now I fear it was the wrong thing to do...? So do we still need dnsmasq? (Actually it makes sense, as the package "dnsmasq" is selected by default in the buildroot. What we were adding, and hopefully will not be needed anymore after this PR, is "dnsmasq-dhcpv6".)
For example, I see that inside /usr/sbin/odhcpd-update there is a mention of dnsmasq.

-L checks whether the file exists and is already a symlink; routers that already have a regular file now get it force-replaced by the mesh lease symlink.
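
The `-L` behaviour described above can be sketched like this. It is an illustration run in a scratch directory, not the code from the PR; `ensure_symlink` is an invented name.

```sh
#!/bin/sh
# Hypothetical helper: make $2 a symlink to $1.
# An existing symlink is left alone; a regular file is replaced (ln -sf).
ensure_symlink() {
    target="$1"
    link="$2"
    [ -L "$link" ] || ln -sf "$target" "$link"
}

# Example in a scratch directory, standing in for /tmp/ethers.mesh and /etc/ethers.
tmpdir=$(mktemp -d)
: > "$tmpdir/ethers.mesh"
echo 'pre-existing regular file' > "$tmpdir/ethers"
ensure_symlink "$tmpdir/ethers.mesh" "$tmpdir/ethers"
[ -L "$tmpdir/ethers" ] && echo "ethers is now a symlink"
rm -rf "$tmpdir"
```

Note that replacing a regular file this way discards its contents; whether a backup of a pre-existing /etc/ethers is wanted is left open here.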
@AguTrachta AguTrachta force-pushed the feature/odhcpd-leases branch from f239950 to 81b6842 Compare July 24, 2025 00:24
@AguTrachta
Contributor Author

I've corrected the two errors you mentioned. (I had a problem with the last commit when I updated this branch from master, which is why I force-pushed to revert it. Please check that everything is all right.)

About dnsmasq, I think we need that package, because we only replace its DHCP part (with option maindhcp '1').

odhcpd-update sends a SIGHUP to dnsmasq so that it refreshes its IP mappings and rebuilds its internal DNS table.

@ilario
Member

ilario commented Jul 24, 2025

I tried to compile an image using the OpenWrt BuildRoot and using the lime-packages feed including the new package from this pull request.

By default, the BuildRoot selects odhcpd-ipv6only (see https://github.com/openwrt/openwrt/blob/19bc6e8c7f86b197b7bccc45c6af4a3ff521e8ee/include/target.mk#L58), but shared-state-odhcpd-leases depends on odhcpd and this causes this error:

 * check_data_file_clashes: Package odhcpd-ipv6only wants to install file /home/user/openwrt24/build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-ipq40xx/etc/init.d/odhcpd
        But that file is already provided by package  * odhcpd
[...]
 * opkg_install_cmd: Cannot install package odhcpd-ipv6only.
make[2]: *** [package/Makefile:99: package/install] Error 255

So it looks like we should also depend on odhcpd-ipv6only, in order to be as compatible as possible with pristine OpenWrt, no?

@AguTrachta
Contributor Author

Interesting. As you can see here, the full odhcpd package supports DHCPv6, so do we need odhcpd-ipv6only?

If you think this is right, we should also mention it in the development section, which advises deselecting this package because it can be problematic.

@ilario
Member

ilario commented Jul 27, 2025

Interesting. As you can see here, the full odhcpd package supports DHCPv6, so do we need odhcpd-ipv6only?

I fear I made a big mistake: I assumed that OpenWrt was using odhcpd for both IPv4 and IPv6, but since they include dnsmasq and odhcpd-ipv6only by default (in the same file linked above: https://github.com/openwrt/openwrt/blob/19bc6e8c7f86b197b7bccc45c6af4a3ff521e8ee/include/target.mk#L53-L58), it looks like they are using dnsmasq as the IPv4 DHCP server?

If this is the case, this package is still very good for sharing the odhcpd leases, but we will still need also shared-state-dnsmasq_leases even when going with OpenWrt default packages selection (which would be amazing, at least in my opinion), right?

So (still assuming that this is the case) the question is: why does lime-proto-anygw depend on dnsmasq-dhcpv6?

DEPENDS:=+dnsmasq-dhcpv6 +kmod-nft-bridge +libuci-lua \

Can't we just stop recommending people to deselect dnsmasq and odhcpd-ipv6only and live happy?
@G10h4ck do you have any idea?

Regarding the dependencies to use in the Makefile of shared-state-odhcpd_leases: do you think it would be OK to include neither odhcpd nor odhcpd-ipv6only, so that people can select what they prefer? Are there going to be problems and errors if no odhcpd at all is present?

Additional info: with dnsmasq we are not managing the sync of the IPv6 leases.
#975

Also, this PR refers to #1189 and #294.

If you think this is right, we should also mention it in the development section, which advises deselecting this package because it can be problematic.

My plan was to just delete these lines:

Deselect problematic packages:
Base system → dnsmasq
Network → odhcpd-ipv6only

@AguTrachta
Contributor Author

This package won't work without the full odhcpd binary installed, because of ubus call dhcp ipv4leases.

I found something useful while digging into dnsmasq-dhcpv6. This variant adds DHCPv6 support to the normal dnsmasq binary, so a single daemon can hand out both IPv4 and IPv6 leases on each interface.
In today’s LibreMesh layout (one VLAN per port for Babel) we actually run two separate processes on every VLAN: dnsmasq for IPv4 and odhcpd-ipv6only for IPv6.

This relates to the next task about removing VLAN from Babel interfaces. This explanation will be improved in an issue, but I think it was worth mentioning.

@AguTrachta
Contributor Author

Just a few comments on dnsmasq-dhcpv6.

LibreMesh’s current base image installs dnsmasq-dhcpv6 as the only DHCP daemon and does not ship odhcpd at all. You can verify it with opkg list-installed | grep dnsmasq on any flashed node. dnsmasq-dhcpv6 is the full-featured build of dnsmasq that handles DNS + DHCPv4 + DHCPv6/RA in one process.
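
To see at a glance which DHCP-related packages a node has, the `opkg list-installed` check can be narrowed down a little. A minimal sketch, assuming the usual `name - version` output format; `dhcp_packages` is an invented helper name and the listing in the example is made up.

```sh
#!/bin/sh
# Hypothetical helper: read `opkg list-installed` style lines
# ("name - version") on stdin and print the DHCP-related package names found.
dhcp_packages() {
    awk '$1 ~ /^(dnsmasq|dnsmasq-dhcpv6|dnsmasq-full|odhcpd|odhcpd-ipv6only)$/ { print $1 }'
}

# On a node this would be: opkg list-installed | dhcp_packages
# Here a made-up listing is used instead.
printf '%s\n' \
    'base-files - 1' \
    'dnsmasq-dhcpv6 - 2' \
    'odhcpd-ipv6only - 3' | dhcp_packages
```

Matching the whole package name avoids the false positives that a plain `grep dnsmasq` would give for unrelated packages whose names merely contain the string.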

Inside /etc/init.d/dnsmasq you can see this code:

if [ -x /usr/sbin/odhcpd ] && [ -x /etc/init.d/odhcpd ]; then
    # maindhcp defaults to 0 when the option is not set
    config_get odhcpd_is_main odhcpd maindhcp 0
    /etc/init.d/odhcpd enabled && odhcpd_is_enabled=1 || odhcpd_is_enabled=0
    if [ "$odhcpd_is_enabled" -eq 0 ] && [ "$DHCPv6CAPABLE" -eq 1 ]; then
        :   # branch body omitted in this excerpt
    elif [ "$odhcpd_is_main" -gt 0 ]; then
        :   # branch body omitted in this excerpt
    fi
fi

Because odhcpd is absent, dnsmasq ends up serving both address families, as in the first branch, and starts with both --dhcp-range (v4) and --enable-ra/--dhcp-range=:: (v6).
You can see the community discussing these topics in the forums.
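
The decision described above can be written as a small standalone function to make the cases easier to compare. This is a sketch under stated assumptions, not the init script itself: the inputs are passed as 0/1 flags instead of being probed from the system, the function name and result labels are invented, and the outcomes for cases the quoted snippet does not show are guesses marked as such.

```sh
#!/bin/sh
# Sketch of the decision discussed above.
#   $1 odhcpd installed, $2 odhcpd enabled, $3 maindhcp, $4 dnsmasq DHCPv6-capable
dnsmasq_dhcp_role() {
    if [ "$1" -eq 1 ]; then
        if [ "$2" -eq 0 ] && [ "$4" -eq 1 ]; then
            echo "dnsmasq: DHCPv4 + DHCPv6"
        elif [ "$3" -gt 0 ]; then
            echo "odhcpd: all DHCP"
        else
            # Assumption: this case is not shown in the quoted snippet.
            echo "dnsmasq: DHCPv4 only"
        fi
    elif [ "$4" -eq 1 ]; then
        # Assumption: follows the claim that dnsmasq serves v4 and v6
        # when odhcpd is absent.
        echo "dnsmasq: DHCPv4 + DHCPv6"
    else
        # Assumption: not covered by the quoted snippet.
        echo "dnsmasq: DHCPv4 only"
    fi
}

dnsmasq_dhcp_role 0 0 0 1   # no odhcpd, DHCPv6-capable dnsmasq (dnsmasq-dhcpv6)
dnsmasq_dhcp_role 1 1 1 0   # odhcpd enabled with maindhcp '1', plain dnsmasq
```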

In the dnsmasq man page you can see that, with the enable-ra directive, dnsmasq begins answering Router Solicitation packets and periodically multicasts Router Advertisements. The RA flags you set in the dhcp-range decide how clients build their IPv6 addresses. By default it seems to use SLAAC, an IPv6 feature where every host combines the advertised /64 prefix with a locally generated interface ID to make a globally unique address.

 ra-only tells dnsmasq to offer Router Advertisement only on this subnet, and not DHCP.

slaac tells dnsmasq to offer Router Advertisement on this subnet and to set the A bit in the router advertisement, so that the client will use SLAAC addresses. When used with a DHCP range or static DHCP address this results in the client having both a DHCP-assigned and a SLAAC address.

ra-stateless sends router advertisements with the O and A bits set, and provides a stateless DHCP service. The client will use a SLAAC address, and use DHCP for other configuration information.

ra-names enables a mode which gives DNS names to dual-stack hosts which do SLAAC for IPv6. Dnsmasq uses the host's IPv4 lease to derive the name, network segment and MAC address and assumes that the host will also have an IPv6 address calculated using the SLAAC algorithm, on the same network segment. The address is pinged, and if a reply is received, an AAAA record is added to the DNS for this IPv6 address. Note that this is only happens for directly-connected networks, (not one doing DHCP via a relay) and it will not work if a host is using privacy extensions. ra-names can be combined with ra-stateless and slaac.

ra-advrouter enables a mode where router address(es) rather than prefix(es) are included in the advertisements. This is described in RFC-3775 section 7.2 and is used in mobile IPv6. In this mode the interval option is also included, as described in RFC-3775 section 7.3. 

What I want to say with all of this info is that it's possibly an option to use full dnsmasq instead of sharing with odhcpd, because with this we are just using one daemon for DNS and DHCP together.

More info:
OpenWrt wiki page detailing the dnsmasq / odhcpd split and maindhcp option
dnsmasq-full package description
OpenWrt forum thread probably confirming single-daemon setups with dnsmasq-dhcpv6?

I hope that all this info helps us better understand the situation!

@ilario
Member

ilario commented Jul 31, 2025

This package won't work without the full odhcpd binary installed, because of ubus call dhcp ipv4leases.

Then it has to depend on odhcpd, you are right :)

What I want to say with all of this info is that it's possibly an option to use full dnsmasq instead of sharing with odhcpd, because with this we are just using one daemon for DNS and DHCP together.

Thanks for doing all this research!

What you say makes a lot of sense, and it is what is currently happening in LibreMesh.

Yet, OpenWrt people are using dnsmasq for IPv4 DHCP and odhcpd-ipv6only for IPv6 DHCP-RA. Looking on the internet, it seems that they prefer odhcpd for IPv6 because it is OpenWrt's own solution and is perfectly integrated in OpenWrt, but they did not stop using dnsmasq for IPv4 because odhcpd is missing some IPv4 features. For example the ones listed here: openwrt/odhcpd#19

And maybe we also need those dnsmasq features; for example, this line looks like it uses one of them:

uci:set("dhcp", owrtInterfaceName, "dhcp_option", { "option:mtu,"..anygw.SAFE_CLIENT_MTU } )

Going 100% on dnsmasq-dhcpv6, like we do now, makes a lot of sense, but takes us away from standard OpenWrt. The advantage of using standard OpenWrt, if I am not mistaken, is that we could use their ImageBuilder just adding our binary repositories and that we could recommend an additional (and much safer) install method for LibreMesh via flashing a clean OpenWrt image and then installing the LibreMesh packages via opkg. The main place to discuss this is #1000. Anyway, please, @a-gave can you confirm if this is right?

Opinions?

@a-gave
Contributor

a-gave commented Aug 1, 2025

My plan was to just delete these lines:

Deselect problematic packages:
Base system → dnsmasq
Network → odhcpd-ipv6only

that we could recommend an additional (and much safer) install method for LibreMesh via flashing a clean OpenWrt image and then installing the LibreMesh packages via opkg

This would be ideal, imho. But to avoid this kind of error (#1199 (comment)), I guess we should stick with the default packages from OpenWrt: dnsmasq (DNS + DHCPv4) and odhcpd-ipv6only (DHCPv6/RA).
Maybe related to this #375

we could use their ImageBuilder just adding our binary repositories

Yes, this is already doable/done; in the package selection we adjust the packages by prefixing a hyphen to remove them: -dnsmasq and -odhcpd-ipv6only.
But as a minor note: using the same packages as OpenWrt (dnsmasq and odhcpd-ipv6only) should also simplify building with the buildroot. Currently, because these packages are removed, if you previously built OpenWrt for a device you have to run make clean before rebuilding LibreMesh.

@G10h4ck
Member

G10h4ck commented Aug 15, 2025

My opinion is that we can merge this PR only on the condition that the default LibreMesh setup based on dnsmasq-dhcpv6 without odhcpd keeps working, full stop.

In LibreMesh we opted for dnsmasq-dhcpv6 because the upstream solution, odhcpd, lacks features that we need. I have been pushed to not use dnsmasq-dhcpv6 a bunch of times, but every time I took the time to verify whether all the needed features were present in odhcpd, something was always missing.

I don't get why there is such an urge to not use dnsmasq-dhcpv6, but for me it is OK if someone wants to use odhcpd and live with the missing features, on the condition that our dnsmasq-dhcpv6 based solution keeps working by default.
